Answered: Help! WordPress keeps missing my scheduled posts!

I gave a new client a totally wrong answer the other day.  Well, it was right as far as I always knew.  Scheduled posts have been a bane to a bunch of blogs I’ve seen. I’d thought that there were bugs in wp-cron, that had been fought for years and years by dozens of people and no one had yet found the right fix.. I just stumbled across this post to the support forum made a year ago by Otto, well known WordPress Guru Supreme, that proved I was completely mistaken. With the additional source of completely swamped servers, I think this post explains every source of recent missed posts I’ve seen:

Short answer: Add this to the to defines in your wp-config.php file:

define(‘ALTERNATE_WP_CRON’, true);

Really long answer, for masochists: Scheduled posts are not now, and have never been, “broken”. The developers of WordPress cannot fix it because there is nothing to fix. The problem lies in the fact that your server, for some reason, cannot properly execute the wp-cron process. This process is WordPress’ timing mechanism, it handles everything from scheduled posts to sending pingbacks to XMLRPC pings, etc. The way it works is pretty simple. Whenever a WordPress page loads, internally WordPress checks to see if it needs to fire off wp-cron (by comparing the current time with the last time wp-cron ran). If it does need to run wp-cron, then it tries to make an HTTP connection back to itself, calling the wp-cron.php file. This connection back to itself is there for a reason. wp-cron has a lot of work to do, and that work takes time. Delaying the user seeing his webpage while it does a bunch of stuff is a bad idea, so by making that connection back to itself, it can run the wp-cron program in a separate process. Since WordPress itself doesn’t care about the result of the wp-cron, it only waits a second, then goes back to rendering the webpage for the user. Meanwhile, wp-cron, having been launched, does its work until it’s finished or it runs out of execution time. That HTTP connection is where some badly configured systems fail. Basically, WordPress is acting like a web browser. If your site was http://example.com/blog, then WP will call http://example.com/blog/wp-cron.php to start the process. However, some servers simply can’t do that for some reason. Among the possible reasons:

  • Server doesn’t have DNS, and so it can’t figure out who “example.com” is, even though it is *itself*.
  • Server administrators, in a misguided attempt at security, have blocked “loopback” requests, so it can’t actually make a call back to itself.
  • Server is running something called “mod_security” or similar, which actively blocks the call due to brain-dead configuration.
  • Something else.

The point is that for whatever reason, your web server is configured in some non-standard way that is preventing WordPress from doing its job. WordPress simply can’t fix that. However, if you have this condition, there is a workaround. Add this to the to defines in your wp-config.php file:

define(‘ALTERNATE_WP_CRON’, true);

This alternate method uses a redirection approach, which makes the users browser get a redirect when the cron needs to run, so that they come back to the site immediately while cron continues to run in the connection they just dropped. This method is a bit iffy sometimes, which is why it’s not the default.

I would add three things to this explanation:

  • mod_security can be configured to block the ip address of the server, the domain name itself or even wp-cron specifically.  You can do some testing of this by doing a wget for your domain from your server. (wget http://example.com/wp-cron.php) and look at the results.  If there’s a problem and you can’t change the mod_security config itself, the fix may be configuring .htaccess to specifically allow from your server’s external IP address access to that file.
  • The loopback address failure can be caused by the Linux networking subsystem failing to load completely. Diagnosing that is off topic here, but you could look for it and see if you can “ping 127.0.0.1” and your domain from your server. If that works, loopback config isn’t the problem. If you it doesn’t, try running “ifconfig lo up” and doing it again. If it works, you’ve got some networking issues to resolve. If once device fails due to a config error, the rest don’t come up in systems I’ve seen.
  • It has been reported and disputed that the line in cron.php that calls the script has  a timeout that is too short or a value that is incompatible with some configurations. I also suspect that this can be a problem on extremely overloaded systems.  However, the fix might be worse than the original symptom.

The line in question is:

wp_remote_post($cron_url, array(‘timeout’ => 0.01, ‘blocking’ => false));

Some people have changed 0.01 to 1 and it works however, this adds a full second to every page load that calls wp-cron – and that could be very costly Google wise.  So be conservative  using this “fix”. On your overloaded system, adding milliseconds or seconds or more might not be noticeable, but then again – you probably have more pressing things you should be worrying about than a few missed posts.

HOW TO: Add menus to the WordPress 3.1 admin bar

This was posted to the wp-hackers list by Frank Bültge and I didn’t want to lose it.  Here is how to add a new menu item to the admin bar:

function wp_codex_search_form() {
  global $wp_admin_bar, $wpdb;

  if ( !is_super_admin() || !is_admin_bar_showing() )
    return;
   $codex_search = '<form target="_blank" method="get" action="
http://wordpress.org/search/do-search.php" style="margin:2px 0 0;">
 <input type="text" onblur="this.value=(this.value==\'\') ? \'Search the
Codex\' : this.value;" onfocus="this.value=(this.value==\'Search the
Codex\') ? \'\' : this.value;" maxlength="100" value="Search the Codex"
name="search">
 <button type="submit">
<span>Go</span>
 </button>
</form>';

 /* Add the main siteadmin menu item */
  $wp_admin_bar->add_menu( array( 'id' => 'codex_search', 'title' => 'Search
Codex', 'href' => FALSE ) );
   $wp_admin_bar->add_menu( array( 'parent' => 'codex_search', 'title' =>
$codex_search, 'href' => FALSE ) );
}

add_action( 'admin_bar_menu', 'wp_codex_search_form', 1000 );

Alternate version of “The Loop”

OK So, most theme developers are aware of what the loop is… it displays your WordPress posts.

The one interesting thing about the loop is that it bounces in and out of the php and straight html output.  So, most of the functions that are called do the echoes themselves.  But what if need the output in a variable, or want to code it so that the call is part of its own echo function? You simply cannot use the same functions.

Fortunately WordPress provides two versions  of each of these core functions.  For example there is the_content and get_the_content() also the_permalink() and get_permalink() (yes, that inconsistency has ALWAYS bothered me..)

Here is a version of  “The Loop” which calls all of the alternate versions of the functions:

if (have_posts()) {

while (have_posts()) {

the_post();
echo ‘<div id=”post-‘ . the_ID() . ‘”>’;
echo ‘ <div></div>’ . “\n”;
echo ‘ <div>’ . “\n”;
echo ‘ <div>’ . “\n”;
echo ‘ <h2><a href=”‘ . get_permalink() . ‘” rel=”bookmark” title=”Permanent Link to ‘ . the_title_attribute(‘echo=0’) . ‘”>’ . the_title(‘echo=0’) . ‘</a></h2>’ . “\n”;
echo ‘ <small> <!– by ‘ . get_the_author() . ‘ –></small>’ . “\n”;
echo ‘ <div>’ . “\n”;
$content = get_the_content();
$content = apply_filters(‘the_content’, $content);
$content = str_replace(‘]]>’, ‘]]&gt;’, $content);
echo $content . “\n”;
echo ‘ </div>’ . “\n”;
echo ‘ </div>’ . “\n”;
echo ‘ </div>’ . “\n”;
echo ‘ <div></div>’ . “\n”;
echo ‘</div>’ . “\n”;

}

echo ‘<div>’ . “\n”;
echo ‘ <div>’ . next_posts_link(‘&laquo; Older Entries’) . ‘</div>’ . “\n”;
echo ‘ <div>’ . previous_posts_link(‘Newer Entries &raquo;’) . ‘</div>’ . “\n”;
echo ‘</div>’ . “\n”;

} else {

echo ‘<h2>Not Found</h2>’ . “\n”;
echo ‘<p>’ . “Sorry, but you are looking for something that isn’t here.” . ‘</p>’ . “\n”;
get_search_form();

}

Determining what versions of WordPress you are hosting

If you host lots of different sites for people, one of the things you might want to know is what versions of WordPress each site is running.

WordPress stores the version number in a variable named $wp_version which is set in the file version.php.

With that information in hand, you can write a bash command that you run from your /home directory to display all of the WordPress versions you have on your server:

find . -name version.php -type f|xargs grep ^\$wp_version

This is one of the aliases I have in my .bashrc file.

WordPress Security – a plugin done right.

It’s rare that I open the source code for a random plugin and see every recommended security measure taken. When looking at Chris Boyd‘s plugin GeoLocation Plugin, I kept digging deeper and deeper and found he’d consistently covered everything. This plugin is a text book example of how to write a secure plugin.

Since he has so many techniques in here, this post wrote itself in my head. It was irresistible.

I strongly recommend you download the plugin and open geolocation.php to follow along

So, here are the security methods I saw him use, there may be more I didn’t catch, if so feel free to add them to the comments:

Use nonces on all input forms

Chances are your plugin will ask for information. A nonce is a security token that helps guarantee that you are getting real data from a real user. In function geolocation_inner_custom_box() Chris calls

echo '<input type="hidden" id="geolocation_nonce" name="geolocation_nonce" 
	value="' .     wp_create_nonce(plugin_basename(__FILE__) ) . '" />';

to add the nonce to his input form and then to check this on the processing side in function geolocation_save_postdata he calls:

  // Check authorization, permissions, autosave, etc
  if (!wp_verify_nonce($_POST['geolocation_nonce'], plugin_basename(__FILE__)))
    return $post_id;

This is very simple code. There’s almost no cost to adding nonces to form processing and you can use that exact code, just changing the name of the field. This will block most bots from abusing your plugin.

Verify the user’s permissions

The user_can functions go hand in hand with the nonces and they work together to block the same kind of attacks. If your plugin does something that just anyone off the street shouldn’t be able to do, verify that the user is allowed to do it. This check is just a little bit further into function geolocation_save_postdata:

  if('page' == $_POST['post_type'] ) {
    if(!current_user_can('edit_page', $post_id))
		return $post_id;
  } else {
    if(!current_user_can('edit_post', $post_id))
		return $post_id;
  }

This simple check to confirm that the user is allowed to edit a post before the geotagging information is added and ties the plugin directly into all of the security already built into WordPress. It is tremendously powerful.

Configure default values

In programming 101, you learn never to trust the computer to give you a clean value for an uninitialized variable. In web development, there are even more implications to this. Caching is one of those. If you call get_option and WordPress does not have a value, it will access the database to see if there is a value set. Until a value is stored in the database, WordPress has to keep looking for it on every page load. Anything you can do that avoids accessing the database files and quite possibly the hard drives where the information is stored, will make your site run faster. If a value is found, that value will be loaded from the cache automatically. Additionally, if you set your default values, you know what you are getting back. Chris handles this in function default_settings():

function default_settings() {
	if(get_option('geolocation_map_width') == '0')
		update_option('geolocation_map_width', '450');

	if(get_option('geolocation_map_height') == '0')
		update_option('geolocation_map_height', '200');

	if(get_option('geolocation_default_zoom') == '0')
		update_option('geolocation_default_zoom', '16');

	if(get_option('geolocation_map_position') == '0')
		update_option('geolocation_map_position', 'after');
}

Use the Settings API and let WordPress do all the work

WordPress 2.7 added a wonderful set of tools called the Settings API. Most plugins do not take advantage of this complex tool set when they could. Chris uses the register_setting API call in function register_settings to enumerate the type of data he wants in each field.

function register_settings() {
  register_setting( 'geolocation-settings-group', 
		'geolocation_map_width', 'intval' );
  register_setting( 'geolocation-settings-group', 
		'geolocation_map_height', 'intval' );
  register_setting( 'geolocation-settings-group', 
		'geolocation_default_zoom', 'intval' );
  register_setting( 'geolocation-settings-group', 
		'geolocation_map_position' );
  register_setting( 'geolocation-settings-group', 
		'geolocation_wp_pin');
}

As you can see he specifies that certain values must be integers. Chris doesn’t take full advantage of the API in this plugin. In fact, I’m not convinced that simply calling register_setting on its own does anything. It is meant to be used with other calls. Had he used them all WordPress could have handled all sorts of things for him automatically, including the nonce field and additional checks.

The settings API is POWERFUL but has a steep learning curve. Once you pass the learning curve your input form printing just becomes five lines:

<form action="options.php" method="post">
	<?php settings_fields('plugin_options'); ?>
	<?php do_settings_sections(__FILE__); ?>
	<input name="Submit" type="submit" 
		value="<?php esc_attr_e('Save Changes'); ?>" />
</form>

This topic is WAAAAAAY to big to cover here and I freely admit, I’m no expert on it. I recommend that you read more: A quick & dirty example or The Whole Shebang

Verify your data

The last two topics are related as the come from the same rule: ALL data that you get from ANY source must be verified. This is where many plugins fail. The assumption is that “if it is in the database, it must be clean” or “it’s from an RSS feed, I know the data is good” or “The variable has INT in the name so it obviously contains an integer” or any of a thousand other gotyas. The way to keep yourself safe from an assumption is to VERIFY.
I want to highlight just two examples of this in Chris’s code.

First, in function display_location, he uses a cast to ensure the data he gets is what he wants from post meta:

$public = (bool)get_post_meta($post->ID, 'geo_public', true);

Second a function clean_coordinates uses regex to ensure the value he receives is what he wants:

$latitude = clean_coordinate($_POST['geolocation-latitude']);
[..]
function clean_coordinate($coordinate) {
	$pattern = '/^(\-)?(\d{1,3})\.(\d{1,15})/';
	preg_match($pattern, $coordinate, $matches);
	return $matches[0];
}

Escape all variable data you display

I won’t show specific examples of this as they are everywhere throughout the whole plugin, but this is one of the most important steps you can take to make your plugin secure. It is also one of the areas that has has the most focus in the WordPress core over the years.

If you are displaying a value that isn’t hard coded in your source, you need to first feed it through one of these functions (or something similar):

esc_attr()	Cleans HTML attributes for use on screen
esc_html()	Prepares complete blocks of HTML code
esc_js()	Special javascript handling of quotes and EOL characters
esc_sql()	Escapes data for use in a query
esc_url()	Returns a valid URL if possible
esc_url_raw()	Prepares an URL to be inserted in a database

This applies to values you are putting into javascript, urls you are displaying on the screen or using for includes, and data used to build image references. All this information must be sanitized. Visit the codex for more information on data validation.

Bonus Tip

If you’ve made it this far in the post, you get one more tip for free. DON’T DO IT ALL YOURSELF. I’d be willing to wager that this plugin was not ‘born’ with all of these security techniques in place. Though this plugin is attributed to Chris in the repository, he lists multiple authors for the plugin. You don’t HAVE to do all of this by yourself. Once your plugin is done, ask people to review it. I can pretty much guarantee that someone will find something. That’s just how it works. But won’t you feel much better if your friend catches the problem before it is discovered because your plugin added an avenue of attack to every site it’s been installed on? Someone will be glad to review it for you it. That’s part of what is great about the WordPress community: people love to help.

Summation

Website security will forever be an on going battle. It is a big, complicated subject with nuances you can miss easily. However, WordPress has been around long enough to have build up quite a collection of tools you have at your disposal as a plugin author. If use the tools WordPress provides, following Chris’s example outlined here, you can rest easy at night knowing you’ve done your job well.

WordPress DB Hacks: Determining the ID of a category parents

When manipulating WordPress databases for exports and merges, sometimes it is helpful to get a list of all of the parents for the categories your posts are in.

For the import I am working on right now.  The category list is being flattened from 30 categories down to 5 categories. The blog has just over 75 thousand posts in it and doing the conversion manually would be a rather arduous process to say the least.

So, I came up with this little query to give me the parent category for each of the child categories on that site.

It’s fairly simple but I’ve had to write it several times now and thought maybe someone else might need it sometime too:

SELECT `tt1`.`term_taxonomy_id` as 'Child Tax ID', `t1`.`term_id` as 'Child Term ID', `t1`.`name` as 'Child Name', `tt2`.`term_taxonomy_id` as 'Parent Tax ID', t2.`term_id` as 'Parent Term ID', `t2`.`name` as 'Parent Name' FROM `wp_term_taxonomy` `tt1`, `wp_term_taxonomy` `tt2`, `wp_terms` `t1`, `wp_terms` `t2` where `tt1`.`term_id` = `t1`.`term_id` and `tt1`.`parent` = `t2`.`term_id` and `tt1`.`taxonomy` = 'category' and `tt2`.`term_id` = `t2`.`term_id` and `tt2`.taxonomy = 'category';

Now I can just grab it instead of rewriting it again next time.  After all, this is The Code Cave…

PS – If you just want to see the top level categories, here’s how:

SELECT `tt1`.`term_taxonomy_id`, `t1`.`term_id`, `t1`.`name` FROM `wp_term_taxonomy` `tt1`, `wp_terms` `t1`where `tt1`.`term_id` = `t1`.`term_id` and `tt1`.`taxonomy` = 'category' and `tt1`.`parent`=0