PHP regular expression functions fail on GoDaddy shared hosting

While testing a crawler script on GoDaddy shared hosting, I noticed that the script was quitting without any notice at random points. Both web and CLI execution modes were affected. The script had previously been tested on an XAMPP server, where it worked fine.

Later, I found that the script always quits after calling one of the regular expression (PCRE) functions, such as preg_replace, preg_match and preg_match_all. The script called them hundreds of times, and one of the calls eventually became fatal.

UPDATE: Actually it appears to be some kind of general problem with long string operations. But switching to multi-byte string regular expression functions helped in most scenarios.

I wrote a simple proof-of-concept script:

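<?php
// Proof of concept: run the same PCRE replacement on a long string forever.
set_time_limit(0); // disable the execution time limit

$a = str_repeat('pattern', 100000); // 700,000-character subject string

$i = 0;
while (1) {
    preg_replace('/pattern/is', 'new', $a);
    echo("{$i}<br />\n"); // print the iteration counter
    flush();
    $i++;
}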

The script runs indefinitely on my local server but exits after 400-500 iterations on GoDaddy shared hosting. I have no idea what causes it.

Fortunately, it is possible to switch to the multibyte string functions, such as mb_ereg_replace, mb_ereg_search_init and mb_ereg_search_regs. After replacing every call to a PCRE function with its equivalent from the multibyte extension, the script works like a charm.

Unfortunately, the multibyte regular expression functions aren’t drop-in replacements for the PCRE functions (patterns have no delimiters and options are passed as a separate parameter), so editing every function call is a lot of work. I recommend defining wrapper functions that accept the same parameters as the PCRE functions.
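
A minimal sketch of such a wrapper for preg_replace might look like the following. The name mb_preg_replace is hypothetical and not part of any extension, and the sketch makes several assumptions: the pattern uses the same delimiter character at both ends, only the i, s and x modifiers are supported, and back-references in the replacement are written as \1 rather than $1. Anchor behaviour (^ and $) may also differ between the two engines, so treat this as a starting point rather than a complete translation layer.

```php
<?php
// Hypothetical wrapper (not from the original post): accepts a PCRE-style
// delimited pattern and forwards it to mb_ereg_replace().
// Assumptions: same delimiter at both ends; only the i, s, x modifiers;
// back-references in $replacement written as \1, not $1.
function mb_preg_replace($pattern, $replacement, $subject)
{
    if (strlen($pattern) < 2) {
        throw new InvalidArgumentException('Pattern is too short');
    }

    $delimiter = $pattern[0];
    $end = strrpos($pattern, $delimiter);
    if ($end === 0) {
        throw new InvalidArgumentException('Pattern has no closing delimiter');
    }

    // Split "/body/modifiers" into its two parts.
    $body      = substr($pattern, 1, $end - 1);
    $modifiers = (string) substr($pattern, $end + 1);

    // PCRE modifier => mbregex option. PCRE "s" (dot matches newline)
    // is spelled "m" in the mbregex option string.
    $map = array('i' => 'i', 's' => 'm', 'x' => 'x');

    $options = '';
    for ($k = 0; $k < strlen($modifiers); $k++) {
        $modifier = $modifiers[$k];
        if (!isset($map[$modifier])) {
            throw new InvalidArgumentException("Unsupported modifier: {$modifier}");
        }
        $options .= $map[$modifier];
    }

    return mb_ereg_replace($body, $replacement, $subject, $options);
}

// Usage: the call site keeps the preg_replace argument order.
echo mb_preg_replace('/pattern/is', 'new', 'PATTERN and pattern'); // prints "new and new"
```

Rejecting unknown modifiers with an exception, instead of silently ignoring them, makes it easier to spot the calls that still need manual attention. Wrappers for preg_match and preg_match_all could follow the same pattern on top of mb_ereg and the mb_ereg_search_* functions.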
