Beware: stream_copy_to_stream and Zend_Http_Client_Adapter_Socket may hang on old PHP 5.2.x

Recently we found that our Zend Framework based application was going into an infinite loop and being terminated by the execution timeout on some hosts. The problem was traced to the Zend_Http_Client_Adapter_Socket class, which uses stream_copy_to_stream when you configure Zend_Http_Client to write data to a stream.

The problem had already been reported on the ZF issue tracker but wasn’t fixed: http://framework.zend.com/issues/browse/ZF-9265.
It seems that the cause is a bug in stream_copy_to_stream that was fixed at some point during PHP 5.2.x development.

But since we need to run our code on virtually any host, we decided to work around the problem by replacing stream_copy_to_stream with fread and fwrite in the Zend_Http_Client_Adapter_Socket code.
Note that Zend_Http_Client_Adapter_Curl is not affected by this problem because it uses internal code for writing to streams. Thus, switching to Zend_Http_Client_Adapter_Curl looks like the easiest solution. We added automatic switching: if the ‘curl_init’ function exists, we use Zend_Http_Client_Adapter_Curl as the adapter for Zend_Http_Client.
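Both workarounds can be sketched roughly as follows. This is not the actual patch we applied: the function names copy_stream_chunked and pick_http_adapter, and the 8192-byte chunk size, are made up for illustration.

```php
<?php
// Hypothetical replacement for stream_copy_to_stream(): copy in chunks
// with fread()/fwrite() and stop as soon as no more data arrives.
function copy_stream_chunked($source, $dest, $chunkSize = 8192)
{
    $total = 0;
    while (!feof($source)) {
        $chunk = fread($source, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // nothing read: leave the loop instead of spinning
        }
        // fwrite() may write fewer bytes than requested, so loop until done.
        while ($chunk !== '') {
            $written = fwrite($dest, $chunk);
            if ($written === false || $written === 0) {
                return $total;
            }
            $total += $written;
            $chunk = (string) substr($chunk, $written);
        }
    }
    return $total;
}

// Hypothetical helper: prefer the cURL adapter when the extension is loaded.
function pick_http_adapter()
{
    return function_exists('curl_init')
        ? 'Zend_Http_Client_Adapter_Curl'
        : 'Zend_Http_Client_Adapter_Socket';
}

// Possible usage with Zend Framework 1:
// $client = new Zend_Http_Client($uri, array('adapter' => pick_http_adapter()));
```

The early break on an empty read is the important part: it guarantees the loop ends even if the end-of-file flag is never raised on the source stream.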

How to set InnoDB as a default storage engine for MySQL tables

I use the InnoDB storage engine because of its support for transactions and referential integrity rules. However, MySQL still creates new tables as MyISAM by default. It was annoying to always specify the storage engine when creating new tables, and to double-check that I hadn’t forgotten it, until I found out how to make InnoDB the default.


Why MySQL timestamp is 24 seconds different from PHP

You may find that the timestamp value returned by the MySQL UNIX_TIMESTAMP() function is 24 seconds greater than the one returned by PHP functions and methods like strtotime(), mktime(), DateTime::getTimestamp() and Zend_Date::getTimestamp().


PHP regular expression functions fail on GoDaddy shared hosting

While testing a crawler script on GoDaddy shared hosting I noticed that the script was quitting without any notice at random points. Both web and CLI execution modes were affected. The script had previously been tested on an XAMPP server, where it worked fine.

Later, I found that the script always quits after calling one of the regular expression (PCRE) functions like preg_replace, preg_match and preg_match_all. The script called them hundreds of times, and one of the calls would turn out to be fatal.

UPDATE: Actually, it appears to be a more general problem with operations on long strings. Still, switching to the multi-byte string regular expression functions helped in most scenarios.


Rewriting for SEO-Friendly URLs: .htaccess or PHP?

Modern database driven web sites implement SEO-friendly URLs emulating static directories and files. Switching to such “clean” URLs enables good indexing by search engines, makes URLs more user-friendly and hides the server-side language. For example, this clean URL may refer to the page in some product directory:

http://somesite.com/products/network/router.html

In fact, there is no /products/network folder on the server and no router.html file at all. The page is generated by a server script using a database query for the “network” product category and the “router” product. But who calls the script, and where does it get the query parameter values?

This technique is usually referred to as “URL rewriting”. It allows the web server to recognize what information was requested by parsing the URL string. Apache and PHP offer multiple options for implementing URL rewriting. So which one is the best?
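To make the parsing step concrete, here is a minimal PHP sketch that extracts the category and product from the clean URL shown above. The function name parse_product_url and the /products/&lt;category&gt;/&lt;product&gt;.html layout are assumptions for illustration only; a real site would route the request to this code through its own rewriting setup.

```php
<?php
// Hypothetical helper: turn a clean URL into query parameters.
// Returns array('category' => ..., 'product' => ...) or null if the
// path does not follow the assumed /products/<category>/<product>.html layout.
function parse_product_url($url)
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    if (!preg_match('#^/products/([^/]+)/([^/]+)\.html$#', $path, $m)) {
        return null;
    }
    return array('category' => $m[1], 'product' => $m[2]);
}

// Possible usage inside the script that serves the page:
// $params = parse_product_url('http://somesite.com/products/network/router.html');
// $params['category'] and $params['product'] then feed the database query.
```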


PHP regular expressions and UTF-8

Perl-compatible regular expression functions in PHP can properly work with Unicode strings. Just add /u modifier to turn on UTF-8 support in preg_replace, preg_match, preg_match_all, preg_split and other PCRE (preg) functions. This way you can parse strings with national characters. For example:

$clean = preg_replace('/\s\s+/u', ' ', $dirty);

If used without the /u modifier, this code damages UTF-8 encoded strings by replacing bytes of national characters that are wrongly interpreted as whitespace. This and many other problems arise because every byte is treated as a single ASCII character, which does not hold for UTF-8.
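The byte-versus-character difference is easy to see with a small experiment. The sketch below is only an illustration; the helper name count_matches is made up, and the sample string spells “é” as the two UTF-8 bytes C3 A9 so the result does not depend on the encoding of the source file.

```php
<?php
// Hypothetical helper: count how many times a pattern matches a string.
function count_matches($pattern, $subject)
{
    return preg_match_all($pattern, $subject, $unused);
}

$word = "h\xC3\xA9llo"; // "héllo": 5 characters, 6 bytes in UTF-8

$bytes = count_matches('/./', $word);  // without /u: "." matches one byte
$chars = count_matches('/./u', $word); // with /u: "." matches one character

// Splitting into characters only works reliably with /u.
$letters = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
```

Here $bytes is 6 and $chars is 5, and $letters contains five elements with “é” kept intact as one two-byte string.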

The modifier is available from PHP 4.1.0 on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
I found this tip, along with much other useful information, on regular-expressions.info. It’s not easy to find in the PHP documentation, but it’s actually hidden here.

SEO-friendly URLs and relative links

The Web community is going crazy about SEO-friendly URLs like http://somesite.com/products/network/router/. Well, it looks much better than a script URL like http://somesite.com/products.php?c=network&p=router, which may actually serve the page behind the scenes. There are a lot of good articles on how to implement SEO-friendly URLs, for example this one or my own post. But they do not warn the reader about one common problem: once you have updated your site to handle virtual paths, you will probably get a nasty surprise:

CSS, image and internal page links are totally broken!

Why? Because those links are usually relative to the page location. The browser has no idea about virtual folders and tries to get files from locations relative to the page URL. For example, suppose there is a typical CSS link in the page header:

<link rel="stylesheet" href="style.css" type="text/css" media="screen" />

Then the browser will try to download the nonexistent file http://somesite.com/products/network/router/style.css and fail silently. No CSS styles will be applied.
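The browser’s behavior can be imitated with a few lines of PHP. This is a deliberately simplified sketch and not a full URL resolver: the function name resolve_link is made up, and it ignores “../” segments, query strings in the link and other details of real URL resolution.

```php
<?php
// Hypothetical helper: resolve a link the way a browser roughly does,
// relative to the "directory" part of the page URL.
function resolve_link($pageUrl, $href)
{
    $parts  = parse_url($pageUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'];

    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href; // already an absolute URL
    }
    if (substr($href, 0, 1) === '/') {
        return $origin . $href; // root-relative: ignores the page path
    }
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1); // keep up to the last "/"
    return $origin . $dir . $href;
}

// Script URL: the link resolves to the site root, where the file lives.
$old = resolve_link('http://somesite.com/products.php?c=network&p=router', 'style.css');
// Clean URL: the same link now points into a virtual folder.
$new = resolve_link('http://somesite.com/products/network/router/', 'style.css');
```

Note that a link starting with a slash resolves to the same address from both pages, because it does not depend on the page path at all.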

It’s incredible how much has been said about SEO-friendly URLs with almost no mention of this relative link problem.
So, what do you have to do? Don’t worry, there are multiple solutions available and I’ll try to explain them all.


Reading, writing and converting RSA keys in PEM, DER, PUBLICKEYBLOB and PRIVATEKEYBLOB formats

This post finishes my saga about the implementation of RSA encryption. See part I and part II of my post about RSA encryption for C++/Delphi (CryptoAPI) and PHP (OpenSSL) applications.

The main problem we faced was the incompatibility of key formats. CryptoAPI uses the PRIVATEKEYBLOB and PUBLICKEYBLOB formats to export and import RSA keys, while the OpenSSL extension for PHP uses the PEM format. In order to use both libraries in communicating applications we needed a tool to convert keys from one format to the other. The only tool we found for this was OpenSSL 1.0.x beta. Note that earlier versions of OpenSSL do not support CryptoAPI BLOBs.

Update: It was found later that CryptoAPI has native functions for key conversion. See “Update” section at the bottom of the post.

Below is a command line syntax example for conversion of private key from PEM to PRIVATEKEYBLOB format:

openssl rsa -inform PEM -in private.pem -outform MS\ PRIVATEKEYBLOB -out private.blob

And this example converts PUBLICKEYBLOB to PEM format:

openssl rsa -pubin -inform MS\ PUBLICKEYBLOB -in public.blob -outform PEM -out public.pem

Notice the backslash (\) in the format names. You need to type it because it escapes the space character.

However, we found some drawbacks in using OpenSSL 1.0.x beta:

  • There was no Windows build of it available at the time of the post, but we wanted to convert keys on Windows.
  • We also wanted to convert keys directly in our code without any need for an external application.

Since the PRIVATEKEYBLOB, PUBLICKEYBLOB and PEM format structures are known, we decided to develop code that reads and writes them using low-level functions. It took me only 1-2 days to develop that code, so I don’t think it’s a really hard task.

Later we faced another problem: PHP versions prior to 5.2 don’t support the openssl_pkey_get_details function. Once again, handling key formats directly helped us resolve the issue by providing a replacement for the function.

So, let me explain how you can implement reading and writing of the PEM, DER, PRIVATEKEYBLOB and PUBLICKEYBLOB formats, with some code examples in PHP for the PEM and DER formats and in C++/VCL for CryptoAPI BLOBs. As the task was part of a commercial project, I cannot post a complete working example here. But I will do my best to help you assemble such code on your own. You can also request our service at Pumka.net.
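As a starting point, the relationship between PEM and DER is simple: a PEM block is the Base64 encoding of the DER bytes wrapped in BEGIN/END lines. The sketch below only illustrates that wrapping and is not the code from our project; the function names pem_to_der and der_to_pem are made up, and parsing the ASN.1 structure inside the DER data is left out.

```php
<?php
// Hypothetical helper: extract the raw DER bytes from the first PEM block.
// Returns false if no well-formed block or valid Base64 body is found.
function pem_to_der($pem)
{
    $pattern = '/-----BEGIN ([A-Z0-9 ]+)-----(.*?)-----END \1-----/s';
    if (!preg_match($pattern, $pem, $m)) {
        return false;
    }
    // Remove line breaks and spaces, then decode in strict mode.
    return base64_decode(preg_replace('/\s+/', '', $m[2]), true);
}

// Hypothetical helper: wrap DER bytes into a PEM block with the given label,
// e.g. "PUBLIC KEY" or "RSA PRIVATE KEY", using 64-character lines.
function der_to_pem($der, $label)
{
    return "-----BEGIN $label-----\n"
        . chunk_split(base64_encode($der), 64, "\n")
        . "-----END $label-----\n";
}
```

A round trip such as pem_to_der(der_to_pem($der, 'PUBLIC KEY')) should give back the original bytes, which makes a handy sanity check before moving on to the key structure itself.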


RSA encryption for C++/Delphi (CryptoAPI) and PHP (OpenSSL) [part 2]

In my previous post I explained that we needed to encrypt messages exchanged between a Windows C++/VCL client and a PHP-based web service. We could not use SSL, so we decided to use RSA encryption with the help of the low-level functions provided by CryptoAPI on the client side and the OpenSSL PHP extension on the server.

We also faced and resolved the key incompatibility problem. See my post about this.

In this post I will describe the implementation of RSA encryption/decryption and digital signing.
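On the PHP side, the basic operations map onto a handful of OpenSSL extension functions. The following is a rough sketch and not the project code: the demo_* function names are made up, and the 2048-bit key size, OAEP padding and SHA-256 signatures are assumptions chosen for the example. The CryptoAPI side has to be configured to match whatever you pick.

```php
<?php
// Hypothetical helper: generate a throwaway RSA key pair as PEM strings.
function demo_generate_keypair()
{
    $key = openssl_pkey_new(array(
        'private_key_bits' => 2048,
        'private_key_type' => OPENSSL_KEYTYPE_RSA,
    ));
    if ($key === false) {
        return false;
    }
    $private = '';
    if (!openssl_pkey_export($key, $private)) {
        return false;
    }
    $details = openssl_pkey_get_details($key);
    return array('private' => $private, 'public' => $details['key']);
}

// Encrypt a short message with the public key. RSA handles only small
// payloads: about 214 bytes for a 2048-bit key with OAEP padding.
function demo_encrypt($plaintext, $publicPem)
{
    $cipher = '';
    $ok = openssl_public_encrypt($plaintext, $cipher, $publicPem, OPENSSL_PKCS1_OAEP_PADDING);
    return $ok ? $cipher : false;
}

// Decrypt with the matching private key.
function demo_decrypt($cipher, $privatePem)
{
    $plain = '';
    $ok = openssl_private_decrypt($cipher, $plain, $privatePem, OPENSSL_PKCS1_OAEP_PADDING);
    return $ok ? $plain : false;
}

// Sign with the private key, verify with the public key.
function demo_sign($message, $privatePem)
{
    $signature = '';
    $ok = openssl_sign($message, $signature, $privatePem, OPENSSL_ALGO_SHA256);
    return $ok ? $signature : false;
}

function demo_verify($message, $signature, $publicPem)
{
    return openssl_verify($message, $signature, $publicPem, OPENSSL_ALGO_SHA256) === 1;
}
```

Key generation appears here only to keep the sketch self-contained. In a client/server setup the keys would be created once and distributed in whatever format each side requires.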


RSA encryption for C++/Delphi (CryptoAPI) and PHP (OpenSSL) [part 1]

This post provides an overview of the RSA encryption implementation. Please read my next post for detailed guidelines and code examples.

The purpose of this project was to protect communication between a C++ (VCL) client application and a PHP server script by encrypting and digitally signing HTTP requests and responses.

Well, the simplest solution for this task is to use SSL (HTTPS). However, this project targets shared hosting users who cannot afford HTTPS or SSL certificates.
That’s why we considered using GnuPG, but abandoned it as hard to implement. Instead, we decided to base our solution directly on the RSA algorithm that PGP is based on.

At the beginning of the project I knew nothing about cryptography, and RSA in particular. I’m still not very familiar with it.
Thanks to Wikipedia, you can read all you need to know about RSA in a single place: http://en.wikipedia.org/wiki/RSA.

What I knew at the beginning was that cryptographic libraries provide tools for RSA encryption, decryption, signing and verification. So I considered this project a manageable job. However, it turned into a complete nightmare.
Why? Because of incompatibility.
Thanks to the standardization of SSL (TLS), all cryptographic libraries are compatible at the top level and can communicate without a problem. However, their low-level functions and key formats are just not compatible.

The conclusions are not very surprising:

  • If you can use SSL on both sides, then you had better use it.
  • If you can use the same cryptographic library on the client and the server, then use it.
  • If you’re going to use low-level functions of different libraries, then be prepared for very hard work dealing with incompatibility.

In this post I will cover the main issues and the conclusions made during development. I’ll give more details and code examples in the second part of the post.

Read my next post for details and code examples on implementing RSA encryption/decryption and digital signing.
