Laravel Distributed Maintenance Mode

The Laravel maintainers have begun to invest more time in providing features and tooling for managing Laravel in a distributed environment.

Laravel 5.4 modified the withoutOverlapping() option for kernel tasks (PR #16196) to use the cache to create a lock and prevent overlapping scheduled tasks across instances.

Laravel Horizon makes it easier to monitor and maintain Laravel-powered Redis queues in both distributed and single-instance deployments.

Laravel 5.5 includes TrustedProxy so that you can more easily place your application behind a load balancer.

What is still missing is the ability to put the application into maintenance mode when running multiple Laravel instances behind a load balancer.

If you don’t need to put your app in maintenance mode, but simply need to prevent people from accessing the site for a short time, the easiest option might be to point your domain to a static maintenance page for a while. For example, you could use Route53 to redirect traffic to an S3 bucket containing a maintenance page.

Why Do We Need Maintenance Mode

Maintenance mode displays a maintenance page to visitors and also prevents queued jobs from running.

Therefore, if you are using the Laravel job queue or scheduled tasks, simply pointing your domain to a static maintenance page is not an ideal solution, as background jobs will continue to run and interact with your database and cache.

This matters most when you have an unexpectedly long-running migration, or when you need to perform emergency maintenance on a Laravel deployment that runs background tasks.

How Does Maintenance Mode Work

The built-in php artisan down command puts the app into maintenance mode by placing a temporary file in the app’s storage directory at /framework/down. You can find the code in the framework’s DownCommand class.

When Laravel processes a request, it is handled by either the HTTP kernel or the console kernel.

The HTTP kernel handles the bulk of your application’s requests, passing each one through the global HTTP middleware and middleware groups before it reaches your routing files.

The console kernel handles Artisan commands, including any scheduled tasks you define.

HTTP requests pass through a CheckForMaintenanceMode middleware that checks whether the application is currently in maintenance mode and, if so, throws an exception that is handled in the app/Exceptions/Handler.php file. On the console side, the scheduler makes a similar isDownForMaintenance() check when deciding which tasks are due.

/**
 * Handle an incoming request.
 *
 * @param  \Illuminate\Http\Request  $request
 * @param  \Closure  $next
 * @return mixed
 *
 * @throws \Symfony\Component\HttpKernel\Exception\HttpException
 */
public function handle($request, Closure $next)
{
    if ($this->app->isDownForMaintenance()) {
        $data = json_decode(file_get_contents($this->app->storagePath().'/framework/down'), true);

        throw new MaintenanceModeException($data['time'], $data['retry'], $data['message']);
    }

    return $next($request);
}

source CheckForMaintenanceMode.php

Similar checks are made in the Worker.php and the QueueManager.php files.

Finally, the isDownForMaintenance() method itself is defined on the application container.

/**
 * Determine if the application is currently down for maintenance.
 *
 * @return bool
 */
public function isDownForMaintenance()
{
    return file_exists($this->storagePath().'/framework/down');
}

source Application.php

Why the Current Solution Doesn’t Work in a Distributed Environment

If your storage directory is shared across all your deployed instances, the current solution is fine.

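Many deployments, however, don’t share storage directories across instances, often because the application doesn’t handle file uploads and has no other need for shared storage. In that case, running php artisan down creates the down file on a single instance only, and every other instance keeps serving traffic as if nothing had happened.

It is also possible that your job queue runs on a different server from the one serving your web app.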
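To make the problem concrete, here is a small framework-free sketch. It is not Laravel code: two temporary directories stand in for the local storage of two servers, and the helper names (isDownForMaintenance as a plain function, makeInstanceStorage) are my own for illustration. The check itself mirrors the file_exists() call shown above.

```php
<?php

// Stand-in for Application::isDownForMaintenance(): each instance only
// looks at the "down" file inside its own storage directory.
function isDownForMaintenance(string $storagePath): bool
{
    return file_exists($storagePath.'/framework/down');
}

// Create a temporary directory that plays the role of one server's
// local storage directory (hypothetical helper, for illustration only).
function makeInstanceStorage(string $name): string
{
    $path = sys_get_temp_dir().'/'.$name.'-'.uniqid('', true);
    mkdir($path.'/framework', 0777, true);

    return $path;
}

$instanceA = makeInstanceStorage('instance-a');
$instanceB = makeInstanceStorage('instance-b');

// Simulate running "php artisan down" on instance A only.
file_put_contents($instanceA.'/framework/down', json_encode(['time' => time()]));

var_dump(isDownForMaintenance($instanceA)); // bool(true)
var_dump(isDownForMaintenance($instanceB)); // bool(false)
```

Instance B never sees the file, so it keeps serving requests and running jobs while instance A is down.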

Solving Distributed Maintenance Mode

Thankfully, we can follow an approach similar to the one in PR #16196, where a lock is checked to confirm that the kernel command is not already running elsewhere.

By using the cache, we can set a maintenance mode lock in Redis that is checked by all deployed instances before continuing the request lifecycle.

Plan of Attack

  • Create a maintenance mode enum.
  • Extend the php artisan down command and override the handle function.
  • Extend the php artisan up command and override the handle function.
  • Register the commands in the service provider and kernel.
  • Extend the application and override the isDownForMaintenance function.
  • Extend the CheckForMaintenanceMode middleware and override the handle function.

Creating the Enums

The first thing that we will do is create some enums for common string based cache keys that will be used throughout the maintenance mode code.

The reason we’re using enums is so that we can update a cache key in one place, without having to hunt down (and potentially miss) references throughout the application.

Here, I’m just using an abstract class, but you could alternatively put these in a config file.

I find this approach cleaner and use it frequently.

<?php

namespace App\Utils\Enums;

abstract class CacheKeys
{
    /**
     * @var string A cache accessor for maintenance mode
     */
    const MAINTENANCE_PAYLOAD = 'laravel-maintenance-mode-payload';
}
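For comparison, the config-file alternative mentioned above might look something like the following. The file name config/maintenance.php and the cache_key entry are hypothetical names I picked for this sketch; nothing in the framework requires them.

```php
<?php

// config/maintenance.php (hypothetical file name)

return [
    // Cache key under which the maintenance mode payload is stored.
    'cache_key' => 'laravel-maintenance-mode-payload',
];
```

With this in place, you would read the key with config('maintenance.cache_key') wherever the examples in this post use CacheKeys::MAINTENANCE_PAYLOAD.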

Extending Artisan Down

Next, extend Laravel’s down command to use the cache instead of the filesystem.

Notice that CacheKeys::MAINTENANCE_PAYLOAD references the enum we created above. This will be used in all the extensions that are created.

<?php

namespace App\Console\Commands;

use App\Utils\Enums\CacheKeys;
use Cache;
use Illuminate\Foundation\Console\DownCommand as Down;

class DownCommand extends Down
{
    /**
     * Execute the console command.
     *
     * Overrides Laravel's DownCommand to use the cache instead of the
     * filesystem so that maintenance mode is propagated across all
     * servers and job queues.
     *
     * @return mixed
     */
    public function handle()
    {
        Cache::forever(CacheKeys::MAINTENANCE_PAYLOAD, json_encode($this->getDownFilePayload()));
        $this->comment('Application is now in maintenance mode.');
    }
}

The getDownFilePayload() method collects the options passed to the Artisan command at runtime and is defined in the original DownCommand class.

/**
 * Get the payload to be placed in the "down" file.
 *
 * @return array
 */
protected function getDownFilePayload()
{
    return [
        'time' => $this->currentTime(),
        'message' => $this->option('message'),
        'retry' => $this->getRetryTime(),
    ];
}

If you wish, you can use the retry field to have the cache automatically purge the key/value after the given period of time and take the application out of maintenance mode.
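A minimal sketch of that idea follows. It assumes a Laravel version, such as 5.5, where Cache::put() takes its expiry in minutes (from 5.8 onward it is expressed in seconds, so the conversion would not be needed there). The retry value is in seconds, so it has to be converted first. The helper name retrySecondsToCacheMinutes is my own, and the commented usage at the bottom is untested wiring rather than part of the original command.

```php
<?php

/**
 * Convert the "retry" value (seconds, or null when --retry was not given)
 * into a whole number of minutes, rounding up so the lock never expires
 * before the retry time. Returns null when the lock should not expire.
 *
 * @param  int|null  $retrySeconds
 * @return int|null
 */
function retrySecondsToCacheMinutes($retrySeconds)
{
    if ($retrySeconds === null || $retrySeconds <= 0) {
        return null;
    }

    return (int) ceil($retrySeconds / 60);
}

// Inside the custom DownCommand::handle(), this could be used roughly as:
//
//   $minutes = retrySecondsToCacheMinutes($this->getRetryTime());
//   $payload = json_encode($this->getDownFilePayload());
//
//   $minutes === null
//       ? Cache::forever(CacheKeys::MAINTENANCE_PAYLOAD, $payload)
//       : Cache::put(CacheKeys::MAINTENANCE_PAYLOAD, $payload, $minutes);
```

Whether you want maintenance mode to end on its own is a judgment call. If the maintenance window overruns, the application will come back up while you are still working, so falling back to Cache::forever() when no retry time is given seems like the safer default.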

Extending Artisan Up

Now, extend Laravel’s up command to purge the cache key that is being used as a lock, thus taking the application out of maintenance mode.

<?php

namespace App\Console\Commands;

use App\Utils\Enums\CacheKeys;
use Cache;
use Illuminate\Foundation\Console\UpCommand as Up;

class UpCommand extends Up
{
    /**
     * Execute the console command.
     *
     * Overrides Laravel's UpCommand to use the cache instead of the
     * filesystem so that maintenance mode is propagated across all
     * servers and job queues.
     *
     * @return mixed
     */
    public function handle()
    {
        Cache::forget(CacheKeys::MAINTENANCE_PAYLOAD);
        $this->info('Application is now live.');
    }
}

Register the New Commands

Now that we have extended the necessary core Laravel commands, we need to tell the application that when it is bootstrapped, it should use the modified commands instead of the Illuminate commands that ship with the framework.

This can easily be accomplished in the AppServiceProvider boot method.

/**
 * Bootstrap any application services.
 *
 * @return void
 */
public function boot()
{
    /**
     * Override Laravel's "php artisan down" command to put the application in maintenance mode
     * using our custom Redis based lock.
     */
    $this->app->extend('command.down', function () {
        return new DownCommand();
    });

    /**
     * Override Laravel's "php artisan up" command to bring the application out of maintenance mode
     * using our custom Redis based lock.
     */
    $this->app->extend('command.up', function () {
        return new UpCommand();
    });
}

Remember to import the commands.

use App\Console\Commands\DownCommand;
use App\Console\Commands\UpCommand;

You will also need to add the commands to the console kernel, as you would any other command.

<?php

namespace App\Console;

use App\Console\Commands\DownCommand;
use App\Console\Commands\UpCommand;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    /**
     * The Artisan commands provided by your application.
     *
     * @var array
     */
    protected $commands = [
        DownCommand::class,
        UpCommand::class,
    ];

    /**
     * Register the Closure based commands for the application.
     *
     * @return void
     */
    protected function commands()
    {
        require base_path('routes/console.php');
    }
}

Extend Laravel’s Application Class

Now that we are using the cache instead of the filesystem to store the maintenance mode lock, the framework needs to be directed to the new location so that it knows when the application is in maintenance mode.

First, extend the core Application class so that we can override Laravel’s isDownForMaintenance method that is currently checking the filesystem for the maintenance lock.

I’ve created an App\Extensions\Illuminate\Foundation directory in an attempt to closely mirror the framework.

<?php

namespace App\Extensions\Illuminate\Foundation;

use App\Utils\Enums\CacheKeys;
use Cache;
use Illuminate\Foundation\Application as App;

class Application extends App
{
    /**
     * Returns whether the application is down for maintenance.
     *
     * Laravel maintenance mode is overridden to use the cache instead of the filesystem,
     * allowing us to treat Redis as a global distributed lock so that
     * maintenance mode is propagated to multiple servers.
     *
     * @return bool Whether the app is in maintenance mode.
     */
    public function isDownForMaintenance()
    {
        return Cache::has(CacheKeys::MAINTENANCE_PAYLOAD);
    }
}

The next step in extending the application is to change how the web application is bootstrapped.

You will find the necessary file to modify in bootstrap/app.php.

The $app variable is now using our extended Application class created above, and thus anytime isDownForMaintenance is called, the cache will be checked for the existence of the maintenance mode lock instead of the storage directory.

/*
|--------------------------------------------------------------------------
| Create The Application
|--------------------------------------------------------------------------
|
| The first thing we will do is create a new Laravel application instance
| which serves as the "glue" for all the components of Laravel, and is
| the IoC container for the system binding all of the various parts.
|
*/

$app = new App\Extensions\Illuminate\Foundation\Application(
    realpath(__DIR__.'/../')
);

Extending Maintenance Mode Middleware

The final step that needs to be taken is to extend the maintenance mode middleware that ships with the framework.

This middleware is checked every time a request comes into the application. We want it to now query the cache lock that we have created instead of the storage directory.

<?php

namespace App\Http\Middleware;

use App\Utils\Enums\CacheKeys;
use Cache;
use Closure;
use Illuminate\Foundation\Http\Exceptions\MaintenanceModeException;
use Illuminate\Foundation\Http\Middleware\CheckForMaintenanceMode as MaintenanceMode;

class CheckForMaintenanceMode extends MaintenanceMode
{

    /**
     * Handle an incoming request.
     *
     * If the application is in maintenance mode, we retrieve the payload from the cache containing
     * the user set message and time to display on the error page when the MaintenanceModeException
     * is thrown.
     *
     * @param \Illuminate\Http\Request $request
     * @param \Closure $next
     * @return mixed
     */
    public function handle($request, Closure $next)
    {
        if ($this->app->isDownForMaintenance()) {
            $data = json_decode(Cache::get(CacheKeys::MAINTENANCE_PAYLOAD), true);

            throw new MaintenanceModeException($data['time'], $data['retry'], $data['message']);
        }

        return $next($request);
    }
}

Laravel registers \Illuminate\Foundation\Http\Middleware\CheckForMaintenanceMode::class in the HTTP kernel’s global middleware stack, so it needs to be replaced with our new custom middleware as follows.

<?php

namespace App\Http;

use App\Http\Middleware\CheckForMaintenanceMode;
use Illuminate\Foundation\Http\Kernel as HttpKernel;

class Kernel extends HttpKernel
{
    /**
     * The application's global HTTP middleware stack.
     *
     * These middleware are run during every request to your application.
     *
     * @var array
     */
    protected $middleware = [
        \Fideloper\Proxy\TrustProxies::class,
        CheckForMaintenanceMode::class,
        \Illuminate\Foundation\Http\Middleware\ValidatePostSize::class,
        \App\Http\Middleware\TrimStrings::class,
    ];
}

You should now be able to call php artisan down and php artisan up to bring the application in and out of maintenance mode!

Tests

Finally, it’s always a good idea to write tests for your code so that you can refactor in the future with more confidence.

I tend to write tests as I code, instead of before, to verify that something is working before I go manually test in the browser.

There are a couple different ways you can approach testing this feature.

The first is a straightforward integration test where you put the application into maintenance mode and then assert that the user cannot access the home route. That might look like the following.

public function test_home_should_not_be_accessible_in_maintenance_mode()
{
    // Arrange
    Artisan::call('down');

    $user = factory(User::class)->create();

    // Act && Assert
    $this
        ->actingAs($user)
        ->get(route('home'))
        ->assertStatus(503);
}

These next tests are arguably less valuable, but might give you more confidence when working with the codebase in the future.

public function test_isDownForMaintenance_returns_true_when_in_maintenance_mode()
{
    // Arrange
    Cache::forever(CacheKeys::MAINTENANCE_PAYLOAD, true);

    // Act && Assert
    $this->assertTrue($this->app->isDownForMaintenance());
}

public function test_isDownForMaintenance_returns_false_when_not_in_maintenance_mode()
{
    // Arrange
    Cache::forget(CacheKeys::MAINTENANCE_PAYLOAD);

    // Act && Assert
    $this->assertFalse($this->app->isDownForMaintenance());
}

There are many more things you could test. For example, you could verify that the middleware passes the request through to the next middleware in the chain when the application is not in maintenance mode, or that the Up and Down commands set and clear the appropriate fields in the cache.
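As a rough sketch of the command tests, something like the following could work. It assumes the custom DownCommand and UpCommand from this post are registered, that your test environment uses a cache driver that works in-process (such as the array driver), and that your Laravel version supports the --message and --retry options on the down command. The class name MaintenanceModeCommandsTest is hypothetical.

```php
<?php

namespace Tests\Feature;

use App\Utils\Enums\CacheKeys;
use Illuminate\Support\Facades\Artisan;
use Illuminate\Support\Facades\Cache;
use Tests\TestCase;

class MaintenanceModeCommandsTest extends TestCase
{
    public function test_down_command_stores_the_payload_in_the_cache()
    {
        // Act
        Artisan::call('down', ['--message' => 'Upgrading', '--retry' => 60]);

        // Assert: the payload written by the command can be read back.
        $payload = json_decode(Cache::get(CacheKeys::MAINTENANCE_PAYLOAD), true);

        $this->assertSame('Upgrading', $payload['message']);
        $this->assertSame(60, $payload['retry']);

        // Clean up so later tests do not start in maintenance mode.
        Artisan::call('up');
    }

    public function test_up_command_removes_the_payload_from_the_cache()
    {
        // Arrange
        Artisan::call('down');

        // Act
        Artisan::call('up');

        // Assert
        $this->assertFalse(Cache::has(CacheKeys::MAINTENANCE_PAYLOAD));
    }
}
```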

Production Environment Performance Difference

Unfortunately, I can’t give a true apples to apples comparison as this was not tested in a controlled environment. The deployment of this upgrade also enabled the OPcache. If you aren’t already using the OPcache, I highly recommend doing some reading on it and enabling it on your production servers.

That being said, our production servers (m4.large) currently have a latency of around 80ms. Your performance will differ based on where your cache is located, the size of your server, your PHP version, and so on.

What You Should Be Aware Of

Finally, be aware that if you are using AWS Elastic Load Balancer with health checks, or a similar service, putting your app into maintenance mode will likely cause your health checks to fail.

You should always dry run these types of changes in a staging environment, which thankfully we did.

Shortly after putting the application into maintenance mode, health checks to our Laravel status endpoints began to return a 503 status code, causing the ELB to take the instances out of the rotation.

In hindsight, this should have been obvious.

A solution is to exempt your status check endpoints from the CheckForMaintenanceMode middleware, which gets run on every request.

To do this, you will want to follow a similar approach to how Laravel’s VerifyCsrfToken middleware works.

class VerifyCsrfToken
{
    /**
     * The URIs that should be excluded from CSRF verification.
     *
     * @var array
     */
    protected $except = [];

    /**
     * Handle an incoming request.
     *
     * @param  \Illuminate\Http\Request  $request
     * @param  \Closure  $next
     * @return mixed
     *
     * @throws \Illuminate\Session\TokenMismatchException
     */
    public function handle($request, Closure $next)
    {
        if (
            $this->isReading($request) ||
            $this->runningUnitTests() ||
            $this->inExceptArray($request) ||
            $this->tokensMatch($request)
        ) {
            return $this->addCookieToResponse($request, $next($request));
        }

        throw new TokenMismatchException;
    }

    /**
     * Determine if the request has a URI that should pass through CSRF verification.
     *
     * @param  \Illuminate\Http\Request  $request
     * @return bool
     */
    protected function inExceptArray($request)
    {
        foreach ($this->except as $except) {
            if ($except !== '/') {
                $except = trim($except, '/');
            }

            if ($request->is($except)) {
                return true;
            }
        }

        return false;
    }
}

Copy the inExceptArray() function and the $except property to your CheckForMaintenanceMode middleware. You can then call inExceptArray() in the handle() function, in a similar fashion to the VerifyCsrfToken middleware above, and skip the maintenance check for exempt URIs.
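One way to package this, sketched below, is a small trait that the custom middleware can pull in. The trait name ExemptsUrisFromMaintenance is hypothetical; the method body is the same logic as VerifyCsrfToken::inExceptArray() above, and it relies only on the request’s is() method.

```php
<?php

// Hypothetical trait name. The method body mirrors
// VerifyCsrfToken::inExceptArray() shown above.
trait ExemptsUrisFromMaintenance
{
    /**
     * Determine if the request URI is exempt from maintenance mode.
     *
     * Expects the class using this trait to define a protected $except
     * array, for example: protected $except = ['status'];
     *
     * @param  \Illuminate\Http\Request  $request
     * @return bool
     */
    protected function inExceptArray($request)
    {
        foreach ($this->except as $except) {
            // Normalise "/status/" to "status", but leave the root URI alone.
            if ($except !== '/') {
                $except = trim($except, '/');
            }

            if ($request->is($except)) {
                return true;
            }
        }

        return false;
    }
}
```

In the custom CheckForMaintenanceMode middleware you would add use ExemptsUrisFromMaintenance; along with a protected $except array listing your status endpoints, and change the condition in handle() to $this->app->isDownForMaintenance() && ! $this->inExceptArray($request). Exempt requests then fall through to $next($request), so the load balancer keeps seeing healthy instances.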

Conclusion

We moved from a filesystem based maintenance mode lock to one stored in a Redis cache.

If you aren’t using Redis or a distributed cache, you can still follow the approach outlined above; just be aware that maintenance mode won’t propagate across all your servers.

Whether you should use this approach over pointing your load balancer to a static maintenance page, I leave up to you.