Türchen 08: Magento Cron Demystified

When running a web shop, a lot of tasks need to run on a regular basis. Usually these tasks don’t require any user interaction and can run in the background. Magento comes with the “Mage_Cron” module, which provides some structure for managing these background tasks.
Unfortunately, this is not transparent at all. It’s hard to find out whether cron has been configured properly, whether it is working correctly and which tasks have been processed. In this post I’ll show you how to configure cron, how to manage existing tasks and implement your own, and how to visualize what’s going on in the background.

Cron configuration

Magento’s cron is triggered by your operating system’s cron. The generic cron dispatcher that comes with Magento checks the configuration to decide which specific tasks need to be executed in Magento. These tasks come with a cron-like configuration as well. But the mere presence of a Magento task won’t make it execute without the operating system’s cron.

Granularity

Magento’s cron script should be called every five minutes (or even every minute). As a rule of thumb, it needs to be called at least as often as the most frequent task in Magento. Otherwise runs of that task pile up and are all executed at once the next time the scheduler is triggered.
If your hosting provider doesn’t allow you to run custom cron jobs at this frequency, it might not be a good choice of hosting provider anyway :)

Scheduling configuration

In System > Configuration > System > Cron (Scheduled Tasks) you can configure the scheduling behavior. Magento creates a schedule on a regular basis and stores records in the cron_schedule table (with status “pending”). The cron dispatcher then processes the pending tasks from that table. Make sure your “Schedule Ahead for” value is bigger than “Generate Schedules Every” to avoid gaps while processing the tasks and to reduce the risk of a task never being scheduled.
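To see why the two values matter, here is a minimal sketch under a simplified model: schedules are generated every G minutes and each generation covers the next A minutes. The function name and the example values are assumptions for illustration, not Magento code.

```shell
#!/bin/sh
# schedule_gap_minutes <generate_every> <schedule_ahead>
# Simplified model: prints how many minutes per generation cycle are
# NOT covered by any pending schedule entry (0 means no gap).
schedule_gap_minutes() {
    generate_every=$1
    schedule_ahead=$2
    if [ "$schedule_ahead" -ge "$generate_every" ]; then
        echo 0
    else
        echo $((generate_every - schedule_ahead))
    fi
}

schedule_gap_minutes 15 20   # ahead > every: no gap
schedule_gap_minutes 30 20   # ahead < every: 10 uncovered minutes per cycle
```

In practice the trigger itself adds some delay, which is one more reason to keep “Schedule Ahead for” strictly bigger than “Generate Schedules Every”.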


cron.php vs. cron.sh

In Magento’s root folder there’s a cron.php file and a cron.sh file. In short, cron.sh internally calls cron.php and makes sure that no more than one process runs in parallel. That’s a good thing, right? Well, sometimes… If you have long-running tasks (e.g. a product import implemented as a task), they will prevent anything else from being executed until they have finished. You might have other important tasks that shouldn’t wait or be skipped just because another task is running.
Check the “Missed if Not Run Within” setting to configure the maximum delay after which a task will no longer be executed.
On the other hand, calling cron.php directly starts a new process every time it is called, which could easily result in performance problems or race conditions for tasks operating on the same data.
This is why cron.php should not be called directly. Find out more about cron groups later in this blog post for a nice solution to this situation.
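The single-process guard can be pictured roughly like this. It is a sketch, not the verbatim cron.sh: the helper name already_running and the sample process listing are made up for illustration.

```shell
#!/bin/sh
# already_running <process_listing> <script_path>
# Succeeds if the listing contains the script path (ignoring grep itself).
# Hypothetical helper for illustration; not the verbatim logic of cron.sh.
already_running() {
    printf '%s\n' "$1" | grep -F -- "$2" | grep -v grep >/dev/null
}

# Made-up listing standing in for the output of "ps"
sample_ps='www-data 4711 php /var/www/magento/cron.php -mdefault 1
www-data 4712 grep /var/www/magento/cron.php'

for script in cron.php cron_import.php; do
    if already_running "$sample_ps" "/var/www/magento/$script"; then
        echo "$script: already running, skip this run"
    else
        echo "$script: not running, start it"
    fi
done
```

Because the check matches on the script path, a differently named script is treated as a separate process, which is what the cron groups section later in this post builds on.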


So this is how to configure your crontab. Make sure it is run as the web server user!

$ sudo crontab -e -u www-data
* * * * * /bin/sh /var/www/magento/cron.sh

If you’re running your cron scheduler on the same machine your frontend runs on, you might want to give the cron process a lower priority:

* * * * * nice -n 10 /bin/sh /var/www/magento/cron.sh

In order to avoid problems while deploying a new package or while doing maintenance, you might want to check for the maintenance.flag before triggering cron.sh:

* * * * * ! test -e /var/www/magento/maintenance.flag && nice -n 10 /bin/sh /var/www/magento/cron.sh
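If the crontab line gets too long, the same guard can live in a small wrapper. This is only a sketch: the function name run_unless_maintenance is an assumption, and the paths in the usage comment are the ones used above.

```shell
#!/bin/sh
# run_unless_maintenance <magento_root> <command...>
# Runs the command only if <magento_root>/maintenance.flag does not exist.
# Hypothetical wrapper for illustration.
run_unless_maintenance() {
    magento_root=$1
    shift
    if [ -e "$magento_root/maintenance.flag" ]; then
        return 0   # maintenance mode: silently skip this run
    fi
    "$@"
}

# Usage with the paths from the crontab above:
# run_unless_maintenance /var/www/magento nice -n 10 /bin/sh /var/www/magento/cron.sh
```

Unlike the inline test, the wrapper returns success when it skips a run, so cron does not treat the skipped run as a failure.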

“Always” tasks

If you have already dug into the Magento cron internals, you might have noticed that Magento CE 1.8 and EE 1.13 introduced a new scheduling mode called “always” (instead of the cron syntax…). As the name says, these tasks will unconditionally be executed every time cron is triggered and don’t need an explicitly defined schedule.
In the Enterprise Edition this is used to trigger the new changelog-based indexing. The Community Edition currently doesn’t seem to use this new feature. However, this feature is part of Mage_Cron and thus can be used for custom tasks in CE as well.
Looking at cron.php you’ll find the little mess that has been added to make this happen (also check this post). Basically, cron.php called without any parameters uses shell_exec to start two cron.sh processes, each with a different parameter (“default” or “always”). cron.sh in turn passes this parameter back to cron.php, which then executes the cron. Internally, Magento uses its event infrastructure to process the two modes by dispatching events with the vacuous names “default” and “always”. Mage_Cron implements two observer methods to do the actual magic: Mage_Cron_Model_Observer->dispatch() and Mage_Cron_Model_Observer->dispatchAlways().


Keeping this in mind I suggest simplifying the process and configuring cron like this instead (add “nice” and maintenance.flag check if required…)

* * * * * /bin/sh /var/www/magento/cron.sh -malways 1
* * * * * /bin/sh /var/www/magento/cron.sh -mdefault 1

Protect cron.php from outside access

The cron.php file is a PHP script that is intended to be run from the command line but could also be triggered from the browser. Some sources even recommend triggering the cron scheduler by calling this script over HTTP on a regular basis (again, if this is a workaround for your hosting not letting you use cron, your hoster is probably not a good fit in the first place).
Cron tasks can potentially run much longer than your maximum execution time or put extra load on the server. Also, you may have a dedicated worker server to process background tasks. This is why cron.php should be blocked from outside access and cron tasks should not run in your web server’s context.
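How you block the file depends on your web server, but the result is easy to verify from the outside. A minimal sketch: the helper is_blocked and the host shop.example.com are placeholders, and treating 401/403/404 as “blocked” is an assumption you may want to adjust.

```shell
#!/bin/sh
# is_blocked <http_status>
# Succeeds if the status code indicates the request was refused.
# Hypothetical helper; the set of codes is an assumption.
is_blocked() {
    case "$1" in
        401|403|404) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage sketch (shop.example.com is a placeholder host):
# status=$(curl -s -o /dev/null -w '%{http_code}' http://shop.example.com/cron.php)
# is_blocked "$status" || echo "WARNING: cron.php is reachable over HTTP"
```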

Create your own task

Creating your own cron task is simple. Add the following snippet to your module’s config.xml file:

<config>
    [...]
    <crontab>
        <jobs>
            <yourtaskname>
                <schedule>
                    <cron_expr>*/5 * * * *</cron_expr>
                </schedule>
                <run>
                    <model>your_module/model::method</model>
                </run>
            </yourtaskname>
        </jobs>
    </crontab>
    [...]
</config>

This is the simplest way of adding a cron job. Mage_Cron will pick this up from the XML configuration and start scheduling it according to your cron expression. The model method you specified in the run->model node will be executed by Magento. The only parameter is the current instance of the schedule object (Mage_Cron_Model_Schedule). Now it’s your turn to implement whatever you want to do with this task.
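If the cron expression syntax is new to you, the following sketch shows how a single field such as “*/5” is matched against the current minute. It is an illustration only, not Mage_Cron’s parser: the function name field_matches is made up, and lists and ranges are left out.

```shell
#!/bin/sh
# field_matches <field> <value>
# Supports "*", "*/n" and a plain number; lists and ranges are omitted.
# Pass values without leading zeros (e.g. 5, not 05).
field_matches() {
    field=$1
    value=$2
    case "$field" in
        '*') return 0 ;;
        '*/'*)
            step=${field#\*/}
            [ $((value % step)) -eq 0 ] ;;
        *) [ "$field" -eq "$value" ] ;;
    esac
}

for minute in 0 3 5 10 12; do
    if field_matches '*/5' "$minute"; then
        echo "minute $minute: run"
    else
        echo "minute $minute: skip"
    fi
done
```

A full expression has five such fields (minute, hour, day of month, month, day of week), and a task is due when all of them match.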
Although it’s easy to hardcode the cron expression, I recommend always sticking to the second option Magento offers. Instead of a cron_expr node, add a config_path node within the schedule node. It points to a value stored in the system configuration, allowing you to define a default and to have the value configured individually:

<config>
    [...]
    <crontab>
        <jobs>
            <yourtaskname>
                <schedule>
                    <config_path>your_module/your_section/cron_expr</config_path>
                </schedule>
                <run>
                    <model>your_module/model::method</model>
                </run>
            </yourtaskname>
        </jobs>
    </crontab>

    <default>
        <your_module>
            <your_section>
                <cron_expr>*/5 * * * *</cron_expr>
            </your_section>
        </your_module>
    </default>
    [...]
</config>

In your system.xml you could add a simple text field to have the cron expression configured directly. For a fancier interface with custom drop-down fields use “adminhtml/system_config_source_cron_frequency”. (Check out Mage_Backup for an example of how to implement this.)

Aoe_Scheduler

Magento’s built-in cron scheduler is pretty simple and comes with some serious limitations if you’re trying to get things done efficiently or trying to find out what’s going on in the background. Check the Aoe_Scheduler module (blog post/documentation: http://www.fabrizio-branca.de/magento-cron-scheduler.html, GitHub: https://github.com/fbrnc/Aoe_Scheduler) for a lot of improvements:

  • Backend, cli and web service access to all tasks
  • Visual timeline
  • Disabling tasks
  • Better error, exception and return value handling.
  • Events that allow custom workflows and dependencies between tasks
  • Email notifications (on success or error)
  • Heartbeat
  • Process management (check out the development branch for this experimental feature)
  • Cron groups
  • …and many more features

Cron groups

One of Aoe_Scheduler’s features is support for cron groups. That means you can run multiple cron.sh commands in parallel (on the same server, or as a strategy to balance background processes across multiple servers). Having a closer look at cron.sh you’ll see that it accepts an optional second parameter that defaults to “cron.php”. This parameter tells cron.sh which PHP script to execute, and cron.sh’s check for an already running process takes the script name into account. This way you can run a defined number of tasks in parallel while making sure that tasks that should not overlap won’t. Let’s say we want to execute three processes in parallel:
  1. cron_always.php will process the “always” tasks
  2. cron_import.php will process your custom import tasks that might take some time
  3. cron_default.php will process all other tasks.
First we need to symlink (or copy, if you like) cron.php to cron_always.php, cron_import.php and cron_default.php. This can be done in your deployment scripts or using modman. Then we create a comma-separated list of all the importer tasks, e.g. ‘xx_import_products, xx_import_categories’. Change the cron configuration to the following to allow Magento to run up to three processes in parallel (again, combine with “nice” and the test for the maintenance.flag if required…):

* * * * * /bin/sh /var/www/magento/cron.sh cron_always.php -malways 1
* * * * * /usr/bin/env SCHEDULER_WHITELIST='xx_import_products, xx_import_categories' /bin/sh /var/www/magento/cron.sh cron_import.php -mdefault 1
* * * * * /usr/bin/env SCHEDULER_BLACKLIST='xx_import_products, xx_import_categories' /bin/sh /var/www/magento/cron.sh cron_default.php -mdefault
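The symlink step could look like this in a deployment script. It is a minimal sketch: the function name link_cron_groups is made up, and the path in the usage comment is the one from the crontab lines.

```shell
#!/bin/sh
# link_cron_groups <magento_root>
# Creates one relative symlink to cron.php per cron group.
# Safe to run on every deployment: -f replaces existing links.
link_cron_groups() {
    magento_root=$1
    for name in cron_always.php cron_import.php cron_default.php; do
        ln -sf cron.php "$magento_root/$name"
    done
}

# Usage with the path from the crontab above:
# link_cron_groups /var/www/magento
```

The links are relative, so they keep working if the whole Magento folder is moved or deployed to a different path.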
The task execution will now look more like this (only showing the non-always processes…):
[Image: cron2]
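The effect of the two environment variables can be pictured as a simple filter on job codes. The sketch below is an assumption about the semantics (an empty whitelist means no restriction, the blacklist always excludes); it is not Aoe_Scheduler’s actual code, and the function names are made up.

```shell
#!/bin/sh
# in_list <job_code> <comma separated list>; tolerates spaces after commas.
in_list() {
    job=$1
    list=$(printf '%s' "$2" | tr -d ' ')
    case ",$list," in
        *",$job,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# job_allowed <job_code> <whitelist> <blacklist>
# Assumed semantics: an empty whitelist allows everything,
# and a blacklisted job is never allowed.
job_allowed() {
    if [ -n "$2" ] && ! in_list "$1" "$2"; then return 1; fi
    if in_list "$1" "$3"; then return 1; fi
    return 0
}

imports='xx_import_products, xx_import_categories'
for job in xx_import_products core_email_queue_send_all; do
    job_allowed "$job" "$imports" '' && echo "cron_import.php runs $job"
    job_allowed "$job" '' "$imports" && echo "cron_default.php runs $job"
done
```

With the same list used once as whitelist and once as blacklist, every job lands in exactly one of the two processes.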


A post by Fabrizio Branca

Fabrizio Branca (Twitter: @fbrnc) is Lead Magento Developer at AOE. He lives with his family in San Francisco, California. On his website http://www.fabrizio-branca.de he blogs about TYPO3, Magento, Varnish, Selenium and his photos. Some of his free Magento modules, such as Aoe_Profiler, Aoe_Scheduler, Aoe_TemplateHints and the Magento-Varnish integration Aoe_Static, can also be found there.

Comments
Hannes wrote:

I have a problem with Magento 1.7.0.2 and --includeGroups... is there no support for it? The cron.sh still executes all cronjobs :-(

Gogo wrote:

Nice read, very helpful, however I have a problem. I have a series of Crons which have the status Missing "Multiple tasks with the same job code were piling up. Skipping execution of duplicates". Can I reset/cancel those that are piled up? its the core_email_queue_send_all, which now sends no emails anymore.


Grigory wrote:

You are right. This one to big for me. But there you wrote http://joxi.ru/J2bJ5q1fyaLD26 So what is wrong?

Brad wrote:

Grigory, you clearly didn't read the article. Whoever reads this, please read the article instead of doing what Grigory suggested.

Grigory wrote:

You forgot abaut cron.php

For more information see the manual pages of crontab(5) and cron(8)

m h dom mon dow command

/2 * nice -n 10 /bin/sh /home/www-glazik/www/cron.sh cron.php

nice -n 10 /bin/sh /home/www-glazik/www/cron.sh cron.php -malways 1

nice -n 10 /bin/sh /home/www-glazik/www/cron.sh cron.php -mdefault 1

shivani wrote:

I am using AOE_Scheduler few days ago its working fine but now its giving error of Last heartbeat is older than one hour. Please check your settings and your configuration! ...could you tell me what steps need to do.

Scott Buchanan wrote:

Fabrizio, I reached out on Twitter too, as I'm not sure what the best way to contact you is. The cron groups you describe here aren't part of the public AOE_Scheduler repo. Is that code public? Google turned up a closed pull request that looks like it might have been the source for it, but I don't know if that was working code or not.


Quentin wrote:

Excellent article thanks guys!

Andreas von Studnitz wrote:

Great insight, thanks a lot! And I hope you had a nice birthday - best wishes for you! :-)

Tobias Vogt wrote:

Hey Fabrizio,

thanks a lot for your current blog post and: Happy Birthday :-) Have a great holiday in germany!

Greetings,

Tobi
