Sunday, June 23, 2013

Zend opcode cacher in PHP 5.5: a security perspective

The new Zend OPcache extension built into the latest version of PHP is, as we've measured, a bit faster than the previous best, XCache. There is, however, a security concern.

Both extensions make it possible to clear the whole cache programmatically (from within a script). In Zend it's the opcache_reset() function, and in XCache it's xcache_clear_cache(XC_TYPE_PHP).

If you run shared hosting, you allow your users to execute arbitrary code in your PHP interpreter. This, naturally, includes the two functions mentioned above. If you rely on the opcode cache to provide the desired level of service for your customers, you don't want any one of them to clear everybody's caches. If a malicious or badly written script clears your cache, say, every minute, there is no benefit from caching: the whole idea is that you compile the most used files once and then serve the cached version from RAM. For a shared host this means increased CPU and possibly disk I/O usage. In some cases, this can turn into a Denial of Service (DoS) attack.

So let's take this simple script and see what Zend Opcache will do when you execute it:

<?php

error_reporting(E_ALL);
ini_set('display_errors', 1);

opcache_reset();

echo 'OK';

Well, Zend happily cleared the cache. Now let's see what XCache does:

<?php

error_reporting(E_ALL);
ini_set('display_errors', 1);

xcache_clear_cache(XC_TYPE_PHP);

echo 'OK';

The XCache developer anticipated this problem. Certain functions, which affect other users' experience, trigger a login prompt. The username and password are set system-wide by an administrator in php.ini and are not viewable by regular users.
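As a minimal sketch of what the administrator has to put in php.ini, the helper below prints the three XCache admin directives. XCache expects the password as an MD5 hash rather than plain text. The function name xcacheAdminIni and the sample credentials are made up for illustration.

```php
<?php
// Hypothetical helper: renders the php.ini lines for XCache admin authentication.
// xcache.admin.pass holds the MD5 hash of the password, not the password itself.
function xcacheAdminIni($user, $password)
{
    return implode("\n", array(
        'xcache.admin.enable_auth = On',
        'xcache.admin.user = "' . $user . '"',
        'xcache.admin.pass = "' . md5($password) . '"',
    ));
}

// Sample credentials only; pick your own before pasting the output into php.ini.
echo xcacheAdminIni('admin', 'change-me'), "\n";
```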

Conclusion

If you run a shared host and want to speed it up using opcode caching, you should use XCache rather than the new Zend OPcache. Zend is only about 15% faster in our benchmark, which is not worth risking a DoS attack. On the other hand, if you and only you control the code that runs on your servers, this security issue won't affect you.

Possible fix

You could (and should) use the disable_functions directive in php.ini to make opcache_reset() unavailable to user scripts.
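In php.ini this comes down to a line such as disable_functions = opcache_reset (the directive takes a comma-separated list of function names). To verify that the setting is active, something like the following sketch could be run on the host. The helper name unblockedCacheResetFunctions and the default list of functions to check are assumptions made for this example.

```php
<?php
// Hypothetical helper: given the value of disable_functions, return the
// cache-clearing functions that user scripts can still call.
function unblockedCacheResetFunctions($disableFunctions, array $risky = array('opcache_reset'))
{
    // disable_functions is a comma-separated list; normalize whitespace and case.
    $disabled = array_filter(
        array_map('trim', explode(',', strtolower($disableFunctions))),
        'strlen'
    );
    return array_values(array_diff($risky, $disabled));
}

$stillCallable = unblockedCacheResetFunctions((string) ini_get('disable_functions'));
echo $stillCallable
    ? 'Still callable: ' . implode(', ', $stillCallable)
    : 'All cache-clearing functions are blocked';
echo "\n";
```

An empty result means every listed function is disabled. The check reads the configuration of whichever SAPI runs it, so it should be executed through the web server rather than the CLI.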

(Need help installing XCache or securing Zend?)

PHP 5.5: Zend Optimiser+ OPcache vs XCache

PHP 5.5 has a new feature: a built-in opcode cache! It's an alternative to extensions like XCache, APC or eAccelerator. Let's see how it performs compared to established solutions.

Test environment

Server: OpenVZ container running on 1 Xeon W3520 CPU core, high I/O priority (to ensure CPU-bound execution), 1 GB RAM. Requests originating from another VPS on that system.

Software: Debian 7.0 i386, nginx 1.2.1, PHP 5.5.0 from dotdeb, XCache SVN trunk from 2013-06-23 (r1269).

This test may or may not reflect what happens on dedicated servers. Unfortunately, we don't have a spare server-grade physical machine at the moment. That said, the machine hosting the VPS is under only a small network and I/O load.

PHP-FPM configuration

pm = dynamic
pm.max_children = 25
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 10
pm.max_requests = 5000

Zend OPcache was configured as follows:

root@opcode:/var/www# grep ^opcache /etc/php5/fpm/php.ini
opcache.enable=1
opcache.enable_cli=1
opcache.memory_consumption=128
opcache.interned_strings_buffer=4
opcache.max_accelerated_files=3907
opcache.fast_shutdown=1

XCache was configured as follows:

xcache.size = 128M
xcache.count = 3
xcache.cacher = On

Joomla

Joomla! 3.1.1. Installation was tricky: in installation/application/framework.php, the line ini_set('display_errors', true); had to be edited, otherwise the installer's AJAX calls failed because the JSON responses were polluted with error messages.

The "Default English" sample data set was installed. The Joomla cache was disabled because it's not what we want to test.

Test

ApacheBench (ab) was used to measure the average response time and the number of requests served per second. The command line was:

ab -n 10000 -c 20 

Ten thousand requests were made over 20 concurrent connections. This fits within the pm.max_children = 25 limit set in the PHP-FPM configuration above.

No cache

Performance is clearly CPU-bound: CPU usage is 100%, and RAM utilization is about 880 MB out of 1 GB. Disk I/O is negligible, measured both from within the container (%wa 0) and from the hypervisor (%wa 7).

Apache Benchmark output

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 176.31.28.48 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:        nginx/1.2.1
Server Hostname:        176.31.28.48
Server Port:            80

Document Path:          /
Document Length:        8626 bytes

Concurrency Level:      20
Time taken for tests:   1743.994 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      89890000 bytes
HTML transferred:       86260000 bytes
Requests per second:    5.73 [#/sec] (mean)
Time per request:       3487.989 [ms] (mean)
Time per request:       174.399 [ms] (mean, across all concurrent requests)
Transfer rate:          50.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   656 3483 1020.6   3166    9587
Waiting:      403 2999 778.5   2812    8757
Total:        656 3483 1020.6   3166    9587

Percentage of the requests served within a certain time (ms)
  50%   3166
  66%   3528
  75%   3849
  80%   4091
  90%   4797
  95%   5540
  98%   6695
  99%   7420
 100%   9587 (longest request)

Zend OPcache enabled

It looks like the CPU wasn't fully utilized with 20 concurrent requests: usage fluctuated between 70% and 90%. I/O and memory usage were, as before, within reasonable limits.

Apache Benchmark output

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 176.31.28.48 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:        nginx/1.2.1
Server Hostname:        176.31.28.48
Server Port:            80

Document Path:          /
Document Length:        8626 bytes

Concurrency Level:      20
Time taken for tests:   658.408 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      89890000 bytes
HTML transferred:       86260000 bytes
Requests per second:    15.19 [#/sec] (mean)
Time per request:       1316.817 [ms] (mean)
Time per request:       65.841 [ms] (mean, across all concurrent requests)
Transfer rate:          133.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   421 1316 426.2   1214    5664
Waiting:      275 1053 355.8    986    4329
Total:        421 1316 426.2   1214    5664

Percentage of the requests served within a certain time (ms)
  50%   1214
  66%   1333
  75%   1448
  80%   1537
  90%   1826
  95%   2065
  98%   2484
  99%   2780
 100%   5664 (longest request)

XCache

As in the uncached run, the load remained CPU-bound with XCache enabled.

Apache Benchmark output

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 176.31.28.48 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:        nginx/1.2.1
Server Hostname:        176.31.28.48
Server Port:            80

Document Path:          /
Document Length:        8626 bytes

Concurrency Level:      20
Time taken for tests:   769.788 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      89890000 bytes
HTML transferred:       86260000 bytes
Requests per second:    12.99 [#/sec] (mean)
Time per request:       1539.576 [ms] (mean)
Time per request:       76.979 [ms] (mean, across all concurrent requests)
Transfer rate:          114.04 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   601 1538 431.1   1439    4401
Waiting:      393 1232 361.6   1178    4205
Total:        601 1539 431.1   1440    4401

Percentage of the requests served within a certain time (ms)
  50%   1440
  66%   1582
  75%   1696
  80%   1784
  90%   2079
  95%   2333
  98%   2847
  99%   3179
 100%   4401 (longest request)

Graphs

Time graph: less is better

Requests per second graph: more is better
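The figures behind both graphs can be recomputed from the "Time taken for tests" lines of the three ab runs above. The short sketch below derives requests per second and the relative speedups. The helper name requestsPerSecond is chosen for this example only.

```php
<?php
// Total wall-clock time in seconds for 10000 requests, copied from the ab outputs above.
$runs = array(
    'No cache'     => 1743.994,
    'Zend OPcache' => 658.408,
    'XCache'       => 769.788,
);
$requests = 10000;

// Requests per second is the request count divided by the total time.
function requestsPerSecond($requests, $seconds)
{
    return $requests / $seconds;
}

foreach ($runs as $name => $seconds) {
    $speedup = $runs['No cache'] / $seconds;
    printf("%-13s %6.2f req/s  %.2fx vs. no cache\n",
        $name, requestsPerSecond($requests, $seconds), $speedup);
}

// Relative throughput gain of Zend OPcache over XCache, in percent.
$gain = ($runs['XCache'] / $runs['Zend OPcache'] - 1) * 100;
printf("Zend OPcache vs. XCache: +%.1f%% throughput\n", $gain);
```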

Comment

Well, XCache is no longer the best opcode cache around. We're looking forward to widespread adoption of PHP 5.5+. Unfortunately, there is currently no GUI for nicely analyzing cache usage, like the one XCache has. However, Zend exposes a status function, opcache_get_status() (named accelerator_get_status() in the standalone Zend Optimizer+), which could be used to create such a GUI, so somebody will likely release one in the coming weeks.

EDIT: We didn't have to wait long - as a reader pointed out, there is a GUI for Zend Optimizer.
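For a rough idea of the data such a GUI works with, here is a minimal sketch built on opcache_get_status(), the name under which the status function ships in PHP 5.5. The helper summarizeOpcacheStatus is hypothetical. It reads only the opcache_statistics and memory_usage sections of the returned array and ignores everything else.

```php
<?php
// Hypothetical helper: condenses the array returned by opcache_get_status()
// into the few figures a dashboard would show first.
function summarizeOpcacheStatus(array $status)
{
    $stats   = $status['opcache_statistics'];
    $memory  = $status['memory_usage'];
    $lookups = $stats['hits'] + $stats['misses'];

    return array(
        'cached_scripts' => $stats['num_cached_scripts'],
        'hit_rate_pct'   => $lookups > 0 ? round(100 * $stats['hits'] / $lookups, 2) : 0.0,
        'used_mb'        => round($memory['used_memory'] / 1048576, 1),
        'free_mb'        => round($memory['free_memory'] / 1048576, 1),
    );
}

// Only query the live cache when OPcache is loaded and returns statistics.
if (function_exists('opcache_get_status')) {
    $status = opcache_get_status();
    if (is_array($status) && isset($status['opcache_statistics'], $status['memory_usage'])) {
        print_r(summarizeOpcacheStatus($status));
    }
}
```

On a shared host the status output has the same exposure problem as opcache_reset(), since it lists other users' cached scripts, so it is worth restricting as well.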

As usual, if you need help, we will make your website fast for you.

Wednesday, June 12, 2013

Pure-PHP Celery connector

Our contribution to the open source community, the Celery-PHP library, had one drawback: it needed a PECL extension that was pretty hard to compile, and not shipped with the most popular distributions. I can only imagine how hard it must have been for some people to set up a working environment.

So during the last month the library has undergone massive refactoring to abstract away the parts responsible for connecting to a queue. This made it possible to create pluggable queue backend objects. Right now two are implemented: "pecl", the old code utilizing a PECL extension, and "php-amqplib", using a pure PHP AMQP library.

If you've used Celery-PHP before, you won't notice any difference. The library will just default to using PECL.

Here's how it works: I've added an extra parameter to the constructor of the Celery class. If it is not supplied, it triggers an auto-detection system which first looks for the PECL extension, then checks whether php-amqplib has been installed by Composer, and gives up if neither is present. You can force the queue type by passing a string - currently 'pecl' and 'php-amqplib' are accepted.
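The detection order described above can be sketched roughly as follows. This is an illustration, not the library's actual code: the function name detectQueueBackend, the injectable $hasPecl and $hasAmqplib flags, and the php-amqplib class name used for the check are all assumptions made for this example.

```php
<?php
// Hypothetical sketch of the auto-detection order, not the library's actual code.
// $forced mirrors the optional constructor parameter: 'pecl', 'php-amqplib' or false.
function detectQueueBackend($forced = false, $hasPecl = null, $hasAmqplib = null)
{
    if ($forced !== false) {
        if (!in_array($forced, array('pecl', 'php-amqplib'), true)) {
            throw new InvalidArgumentException("Unknown queue backend: $forced");
        }
        return $forced;
    }

    // The checks can be injected so the logic is testable without either library.
    if ($hasPecl === null) {
        $hasPecl = extension_loaded('amqp');
    }
    if ($hasAmqplib === null) {
        // Assumption: php-amqplib is autoloadable via Composer under this class name.
        $hasAmqplib = class_exists('PhpAmqpLib\\Connection\\AMQPStreamConnection');
    }

    if ($hasPecl) {
        return 'pecl';          // existing installations keep their old behaviour
    }
    if ($hasAmqplib) {
        return 'php-amqplib';   // pure-PHP fallback
    }
    throw new RuntimeException('No AMQP backend found: install the PECL extension or php-amqplib');
}

echo detectQueueBackend('php-amqplib'), "\n";
```

Checking PECL first is what keeps existing installations unchanged: a server that already has the extension never reaches the php-amqplib branch.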

The new php-amqplib backend passes all unit tests except the ones checking how it behaves on long-running tasks. The problem is that php-amqplib doesn't allow specifying an arbitrary interval when waiting for a new message: it's always an integer number of seconds. For this reason some tests don't behave as expected, but it shouldn't affect real-life code, other than that "asynchronous" checks for a result take a whole second. I've prepared a pull request to address this problem, but the developer hasn't decided to merge it yet.

If you want to try it out, pull the code from its branch on GitHub. It will be available in the main branch once it's well tested and the php-amqplib pull request goes through (so vote for it if you want to use PECL-less Celery!).