Blursed async PHP: Dumb, but also kind of fun
PHP is a fairly forgiving programming language. It doesn’t force you to learn about things like compilation steps, deployment strategies, concurrency, and proper memory management. If you only need a simple web page with some dynamic behaviour, all you have to do is write a PHP script and upload it somewhere!
Of course, as you move on to more complex applications you start to run into some of its limitations. The most jarring one is the fact that there is no straightforward way to run time-consuming parts of your code asynchronously.
For example, let’s say we have a PHP script slow.php which includes a function do_something() that takes a whopping 5 seconds to complete:
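A minimal sketch of what such a slow.php could look like. The sleep(5) call stands in for real work, and the error_log() line is an assumption about how the function reports that it is done:

```php
<?php
declare(strict_types=1);

// slow.php (sketch): defines the slow function, but does not call it.
function do_something(): void
{
    // Stand-in for five seconds of real work.
    sleep(5);

    // Assumption: report completion via the error log rather than the response.
    error_log('do_something() finished');
}
```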
We can include slow.php and call do_something() from another script (let’s name this one sync.php):
As expected, this gives us the following output:
sync.php is pretty slow. It’s not just do_something() that takes five seconds: everything takes five seconds. Someone who tries to load sync.php doesn’t see anything until the entire script has finished its execution. This makes the first line in the output more of a log than a notification or announcement that tells the user what is happening.
Since users probably aren’t as forgiving as PHP, we have to “offload” heavy tasks to something else that can run them in the background. There are various widely used solutions, including:
Long-running PHP processes based on Symfony Messenger, Laravel’s queues or “homemade” code, which can process queued tasks as jobs (or messages). This is done synchronously, but since they run separately from the main application this doesn’t affect execution times of HTTP requests.
Gearman is an application framework that helps you distribute tasks to other computers. It’s kind of like the first solution, except that Gearman is so ancient that its website isn’t even served over HTTPS.
But what if you are on shared hosting and can’t use any of these solutions?
If you’re lucky, your hosting provider lets you use functions like system(), which make it possible to execute Linux commands from your PHP script.
Here we have another script that we’ll name
Rather than including slow.php and calling the function directly, we execute the script using exec():
Like all other functions in PHP, exec() is synchronous. But that doesn’t matter, because we’ve appended > /dev/null & to our command. This tells the operating system that the command should run in the background and we’re not interested in (and thus don’t want to wait for) any of its output.
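To see the effect of that suffix in isolation, here is a small sketch. The run_in_background() helper is a hypothetical name, and a two-second sleep command stands in for the slow script so the example doesn't depend on any other file:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: starts a shell command without waiting for it.
function run_in_background(string $command): void
{
    // "> /dev/null" discards the command's output, "&" sends it to the background.
    // Without the redirect, exec() would block until the command has finished.
    exec($command . ' > /dev/null &');
}

$start = microtime(true);
run_in_background('sleep 2'); // stand-in for running the slow script
$elapsed = microtime(true) - $start;

// exec() should return well before the two seconds are over.
printf("exec() returned after %.2f seconds\n", $elapsed);
```

If the command is built from user input, it should be escaped first, for example with escapeshellarg().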
A user who accesses the script above from a web browser therefore only sees the following:
This response appears almost immediately! We can see in the server logs that
do_something() finished about 5 seconds after the request to
Many hosting providers disable the use of functions like exec() for security reasons. Fortunately, there are other options.
Here we have another script, headers.php, that uses HTTP headers to create the illusion that do_something() is called asynchronously:
HTTP headers can be used to provide invisible instructions to browsers or the web server. The Location header for example can be used to transparently redirect a browser to another web page. But there are many other possible headers. headers.php includes three such headers:
Connection: close tells the browser that the server wants to close the connection. From the user’s perspective, this means that the browser no longer shows a loading spinner, and the “stop” button turns into a “refresh” button.
Content-Length tells the browser how much data it can expect before the connection can be closed. Note that the value is based on the length of $output. This means that any content that is generated after the “Bye!” will not be visible in the browser.
Web servers are often configured to compress responses for performance reasons. In this case it means that our calculated Content-Length is likely larger than what the server actually sends. Content-Encoding: none disables compression for this request, so that the connection is closed at the right moment.
flush() makes sure that the headers and $output are sent to the browser immediately. The script doesn’t terminate yet, because it still needs to call do_something(), but as far as the user is concerned, it has finished.
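The header trick described above can be sketched roughly as follows. The two helper functions are hypothetical names, the “Hello!” line is an assumed response body, and the output-buffer loop is an extra precaution for servers that have PHP's output buffering enabled:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: builds the three headers for a given response body.
function early_response_headers(string $output): array
{
    return [
        'Connection: close',
        'Content-Length: ' . strlen($output), // length in bytes
        'Content-Encoding: none',
    ];
}

// Hypothetical helper: sends the complete response and then returns,
// so that slow work can follow in the same script.
function send_early_response(string $output): void
{
    foreach (early_response_headers($output) as $line) {
        header($line);
    }

    // Assumption: output buffering may be on. Empty PHP's own buffers first,
    // otherwise flush() has nothing to push to the web server yet.
    while (ob_get_level() > 0) {
        ob_end_flush();
    }

    echo $output;
    flush();
}

send_early_response("Hello!\nBye!\n");

// From here on the browser should treat the response as complete.
// This is the point where the slow function would be called.
```

Whether the browser really stops waiting depends on the web server in front of PHP: a proxy or server-side buffer can still hold the response back until the script ends.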
The output of
headers.php is very similar to that of
do_something() was called directly in the script. However, the output generated
by that function can only be found in the server logs:
In the first two solutions, do_something() was called as quickly as possible. While this has its advantages, both approaches can easily overload a server and bring everything to a grinding halt.
There’s a third solution that is safer and more popular: cron-based task execution.
cron is a utility on Unix-like systems that lets you schedule commands that need to run periodically, e.g. every minute.
do_something() needs to be executed for each request – not periodically.
So how does this help us?
Well, we can queue do_something() invocations. The following script (cron-trigger.php in the companion repository) shows how we can queue invocations:
Each time we receive a request, we store some metadata about the task that we want to execute asynchronously. Here we do that by creating files in a directory for temporary files. It doesn’t really matter how and where you store the metadata. You can also use a database if you have one!
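As a rough sketch of the file-based variant, a queue_job() function could look like this. The JSON format, the file naming scheme and the write-then-rename step are assumptions for illustration, not necessarily what the companion repository does:

```php
<?php
declare(strict_types=1);

// Sketch: stores one job as a JSON file and returns the path of that file.
function queue_job(array $metadata, string $dir = '/tmp/jobs'): string
{
    // Create the queue directory on first use.
    if (!is_dir($dir) && !mkdir($dir, 0700, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create job directory: $dir");
    }

    // Assumed naming scheme: uniqid() is time-based, so names sort roughly by age.
    $path = $dir . '/' . uniqid('job_', true) . '.json';
    $json = json_encode($metadata, JSON_THROW_ON_ERROR);

    // Write under a temporary name and rename afterwards, so that a reader
    // looking for *.json files never sees a half-written job.
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $json) === false || !rename($tmp, $path)) {
        throw new RuntimeException("Cannot write job file: $path");
    }

    return $path;
}
```

A request handler could then call something like queue_job(['task' => 'do_something', 'queued_at' => time()]) and return its response right away.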
queue_job() writes “jobs” to a “queue”, but we still need a script (
that can process them:
By configuring this script in a so-called crontab, we can make sure that it is executed every minute:
Each time the script runs it will look for “jobs” in the /tmp/jobs directory and invoke the do_something() function for each of them. This is done
synchronously and consecutively, so if
/tmp/jobs contains a large number of
The output in the browser for cron-trigger.php should look quite familiar by now:
The results of do_something() should be visible in the cron logs:
Of course, this solution isn’t perfect either. You probably need to make sure that jobs are processed in the correct order, exactly once, and don’t have to wait too long.
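To make those concerns a bit more concrete, here is a sketch of a worker that handles jobs in file name order and claims each job by renaming it first. The process_jobs() function, the .json and .processing suffixes, and the assumption that file names sort in queue order are all hypothetical:

```php
<?php
declare(strict_types=1);

// Sketch: processes all queued jobs in $dir and returns how many were handled.
function process_jobs(string $dir, callable $handler): int
{
    $files = glob($dir . '/*.json') ?: [];
    sort($files); // assumption: file names sort in the order jobs were queued

    $processed = 0;
    foreach ($files as $file) {
        // Claim the job. If an overlapping run already took this file,
        // rename() fails and we simply move on to the next one.
        $claimed = $file . '.processing';
        if (!@rename($file, $claimed)) {
            continue;
        }

        $metadata = json_decode((string) file_get_contents($claimed), true);
        if (is_array($metadata)) {
            $handler($metadata);
            $processed++;
        }

        unlink($claimed);
    }

    return $processed;
}
```

A cron-driven script could call it as process_jobs('/tmp/jobs', fn (array $job) => do_something()). Note that this gives “at most once” rather than “exactly once”: if the handler crashes, the claimed file stays behind and has to be cleaned up or retried by hand.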
Choose your poison. ;-)