Notty Notty! re: “All my server CPU is used by root@notty!! Have I been hacked?”

First – breathe. “notty” stands for “no teletypewriter”. Programs that connect to the server but don’t need their output displayed anywhere use a “no TTY” connection. So if you see “sshd: *@notty” in a task list somewhere, it just means there is an SSH login on your server that does not have a terminal assigned to it.
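You can see the same condition for yourself locally. The tty utility reports which terminal is attached to standard input, and when there isn’t one (for example, because stdin is a pipe) it prints the very phrase this label comes from. A quick local sketch:

```shell
# tty(1) prints the attached terminal device, or "not a tty" when
# standard input is not a terminal (e.g. a pipe or redirect) - the
# same condition sshd records as "@notty".
echo hello | tty
# prints: not a tty
```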

This can show up during many relatively common server activities, so it is not the mark of some hacker, as you might have feared. One of the most common examples is the scp command. scp copies files from one computer to another over SSH. When it connects to the remote computer, it isn’t displaying that session on a screen, so the connection is a notty connection.

Below is a partial capture of a “top -c” command while scp is running:

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
32706 root      15   0 14284 7116 2336 S 68.1  0.3   6:15.37 sshd: root@notty
32709 root      18   0  6788 1468 1124 R  4.0  0.1   0:20.84 scp -r -f /home2

As you can see, the CPU usage was pretty high, and that’s what gets people worried. They are probably looking at “top” to see how much longer the copy will take. Then they see that scp is using almost nothing while a task named “notty” is eating huge amounts of CPU, and they think someone is being “naughty”. Now you know what is really happening.

So relax! It’s all good.

How to Add a TXT record to your 1and1 domain & How to use external DNS for a 1and1 hosted site.

Unfortunately there are lots of registrars that don’t give you full access to your DNS settings. 1and1.com is one of these. If you host your site with 1and1.com and you want to add a TXT record to your domain, for verification purposes, to set an SPF record, or whatever, you simply can’t do it… unless…

If you have access to another DNS server that allows you to edit your DNS zone and add TXT records, you can point your 1and1 domain at that DNS server. However, you must THEN edit the DNS zone on that server and have all of the A records point back to the IP address of your 1and1 account.

Here are the steps:

  1. Ping your site and write down the IP address
  2. Go to the 1and1 admin domain listing
  3. Select the row for your domain (it should be the only one checked) and then click DNS->Edit DNS.
  4. Select My DNS Servers
  5. Enter the hostnames of your other DNS servers (something like ns1.example.com and ns2.example.com).
  6. Close and save and allow a few hours for this to update before testing it.
  7. In the meantime, go to your other DNS server and set up a new DNS zone for your domain. It needs at LEAST an A record pointing to the IP address you wrote down in step one. You probably also want to set up a CNAME for your www subdomain. You can also set up your TXT record here.
  8. After hitting save, wait a few hours and you should be done.
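For step 7, the zone you create might look something like this minimal BIND-style sketch. The domain, IP address, and record values here are placeholders; substitute your own domain and the IP you wrote down in step 1:

```
$TTL 3600
@    IN SOA ns1.example.com. hostmaster.example.com. (
            2011042601 ; serial
            7200       ; refresh
            900        ; retry
            1209600    ; expire
            3600 )     ; minimum
@    IN NS    ns1.example.com.
@    IN NS    ns2.example.com.
@    IN A     203.0.113.10         ; the 1and1 IP from step 1
www  IN CNAME example.com.         ; the www CNAME from step 7
@    IN TXT   "v=spf1 a mx ~all"   ; your verification/SPF record
```

Once propagation finishes, querying the new server directly (e.g. dig TXT example.com @ns1.example.com) should show the TXT record even before the registrar change has fully propagated.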


If you are a visual learner, here is a screencast.

Generating random names in MySQL

I’ve improved my earlier random string generation procedures to better suit my needs. So I created a Random Name Generator for MySQL.

I’ve created two new procedures. They pick from the 100 most popular first names (well actually the 50 most popular male and 50 most popular female first names for the US) and the 100 most popular surnames (for the US).

Using these two procedures, generate_fname() and generate_lname(), you can create realistic random names and email addresses for your tests.
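For example, here is a quick sketch of how that test data might be built. It assumes generate_fname() and generate_lname() are defined as stored functions that return strings (so they can be called inside a SELECT); example.com is just a placeholder domain:

```sql
-- Sketch only: assumes generate_fname() / generate_lname() are
-- stored functions returning strings. example.com is a placeholder.
SELECT CONCAT(generate_fname(), ' ', generate_lname()) AS full_name,
       LOWER(CONCAT(generate_fname(), '.', generate_lname(),
                    '@example.com'))                   AS email;
```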

You can download the SQL here.

How do I create a random string in MySQL?

There are lots of quick and dirty ways to create a random string in MySQL.
If you want letters and numbers, just do this:

SELECT LOWER(
  SUBSTRING(
    MD5( RAND() ),
    FLOOR( 7 + ( RAND() * 14 ) ),
    FLOOR( 3 + ( RAND() * 4 ) ) ) ) AS fname

However, this method won’t let you restrict the result to a specific set of characters. The function below will: for example, you can specify that you want only alpha characters and no numbers, or you could generate hexadecimal numbers by specifying just that character set:

DROP FUNCTION IF EXISTS generate_word;
DELIMITER $$
CREATE FUNCTION generate_word (counter SMALLINT) RETURNS VARCHAR(255)
BEGIN
DECLARE result VARCHAR(255) DEFAULT '';
REPEAT
SET result = CONCAT(ELT(FLOOR(1 + (RAND() * 52)),
'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'),
result);
SET counter = counter - 1;
UNTIL counter = 0
END REPEAT;
RETURN result;
END$$

DELIMITER ;
select generate_word(FLOOR( 3 + ( RAND( ) * 14 ) )),generate_word(FLOOR( 3 + ( RAND( ) * 14 ) )),generate_word(FLOOR( 3 + ( RAND( ) * 14 ) )),generate_word(FLOOR( 3 + ( RAND( ) * 14 ) )),generate_word(FLOOR( 3 + ( RAND( ) * 14 ) ));
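And a hedged sketch of the hexadecimal case mentioned above: the same ELT/RAND pattern with a 16-character set picks one random hex digit, and you can wrap it in the same REPEAT loop that generate_word uses to build longer strings:

```sql
-- One random hexadecimal digit via the same ELT/RAND pattern;
-- wrap it in a REPEAT loop (as in generate_word) for longer strings.
SELECT ELT(FLOOR(1 + (RAND() * 16)),
           '0','1','2','3','4','5','6','7',
           '8','9','a','b','c','d','e','f') AS hex_digit;
```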

Hope that helps someone.

Simple $wpdb error handling

For my own copy-and-paste pleasure, here are some $wpdb error handling examples. The list will grow over time:

A very simple non-failing method for use in a function:


// get_results() executes the query itself; checking $wpdb->last_error
// afterwards avoids running the query twice (query() followed by
// get_results() would execute it two times).
$results = $wpdb->get_results( $query );
if ( '' !== $wpdb->last_error ) {
	return FALSE;
}
return $results;

A more complex method for use in a function. It uses WordPress’s built-in WP_Error class:


// $wp_error here is a boolean parameter of the enclosing function
// (WordPress-core style): return a WP_Error object, or a plain 0.
if ( false === $wpdb->query( $sql ) ) {
	if ( $wp_error ) {
		return new WP_Error( 'db_query_error',
			__( 'Could not execute query' ), $wpdb->last_error );
	} else {
		return 0;
	}
}

A very simple flow-breaking method, in-line and procedural:


if ( $wpdb->query( $query ) === FALSE ) {
	wp_die( __( 'Crap! Well that\'s screwed up: ' ) . $wpdb->last_error );
}

Story line of the Amazon EC2 and RDS failure and recovery

I wanted to read the timeline of the Amazon “Networking Event”, so I’ve taken the logs of the EC2 and RDS status updates and put them together in one post.

If you want to learn about how to make a more robust Amazon Web Services (AWS) configuration, read my article on the WebDevStudios blog.

RDS Apr 21, 1:48 AM PDT We are currently investigating connectivity and latency issues with RDS database instances in the US-EAST-1 region.

RDS Apr 21, 2:16 AM PDT We can confirm connectivity issues impacting RDS database instances across multiple availability zones in the US-EAST-1 region.

RDS Apr 21, 3:05 AM PDT We are continuing to see connectivity issues impacting some RDS database instances in multiple availability zones in the US-EAST-1 region. Some Multi AZ failovers are taking longer than expected. We continue to work towards resolution.

RDS Apr 21, 4:03 AM PDT We are making progress on failovers for Multi AZ instances and restore access to them. This event is also impacting RDS instance creation times in a single Availability Zone. We continue to work towards the resolution.

RDS Apr 21, 5:06 AM PDT IO latency issues have recovered in one of the two impacted Availability Zones in US-EAST-1. We continue to make progress on restoring access and resolving IO latency issues for remaining affected RDS database instances.

RDS Apr 21, 6:29 AM PDT We continue to work on restoring access to the affected Multi AZ instances and resolving the IO latency issues impacting RDS instances in the single availability zone.

RDS Apr 21, 8:12 AM PDT Despite the continued effort from the team to resolve the issue we have not made any meaningful progress for the affected database instances since the last update. Create and Restore requests for RDS database instances are not succeeding in US-EAST-1 region.

RDS Apr 21, 10:35 AM PDT We are making progress on restoring access and IO latencies for affected RDS instances. We recommend that you do not attempt to recover using Reboot or Restore database instance APIs or try to create a new user snapshot for your RDS instance – currently those requests are not being processed.

RDS Apr 21, 2:35 PM PDT We have restored access to the majority of RDS Multi AZ instances and continue to work on the remaining affected instances. A single Availability Zone in the US-EAST-1 region continues to experience problems for launching new RDS database instances. All other Availability Zones are operating normally. Customers with snapshots/backups of their instances in the affected Availability zone can restore them into another zone. We recommend that customers do not target a specific Availability Zone when creating or restoring new RDS database instances. We have updated our service to avoid placing any RDS instances in the impaired zone for untargeted requests.

RDS Apr 21, 11:42 PM PDT In line with the most recent Amazon EC2 update, we wanted to let you know that the team continues to be all-hands on deck working on the remaining database instances in the single affected Availability Zone. It’s taking us longer than we anticipated. When we have an updated ETA or meaningful new update, we will make sure to post it here. But, we can assure you that the team is working hard and will do so as long as it takes to get this resolved.

RDS Apr 22, 7:08 AM PDT In line with the most recent Amazon EC2 update, we are making steady progress in restoring the remaining affected RDS instances. We expect this progress to continue over the next few hours and we’ll keep folks posted.

RDS Apr 22, 2:43 PM PDT We are continuing to make progress in restoring access to the remaining affected RDS instances. We expect this progress to continue over the next few hours and we’ll keep folks posted.

EC2 Apr 22, 2:41 AM PDT We continue to make progress in restoring volumes but don’t yet have an estimated time of recovery for the remainder of the affected volumes. We will continue to update this status and provide a time frame when available.

EC2 Apr 22, 6:18 AM PDT We’re starting to see more meaningful progress in restoring volumes (many have been restored in the last few hours) and expect this progress to continue over the next few hours. We expect that we’ll reach a point where a minority of these stuck volumes will need to be restored with a more time consuming process, using backups made to S3 yesterday (these will have longer recovery times for the affected volumes). When we get to that point, we’ll let folks know. As volumes are restored, they become available to running instances, however they will not be able to be detached until we enable the API commands in the affected Availability Zone.

EC2 Apr 22, 8:49 AM PDT We continue to see progress in recovering volumes, and have heard many additional customers confirm that they’re recovering. Our current estimate is that the majority of volumes will be recovered over the next 5 to 6 hours. As we mentioned in our last post, a smaller number of volumes will require a more time consuming process to recover, and we anticipate that those will take longer to recover. We will continue to keep everyone updated as we have additional information.

EC2 Apr 22, 2:15 PM PDT In our last post at 8:49am PDT, we said that we anticipated that the majority of volumes “will be recovered over the next 5 to 6 hours.” These volumes were recovered by ~1:30pm PDT. We mentioned that a “smaller number of volumes will require a more time consuming process to recover, and we anticipate that those will take longer to recover.” We’re now starting to work on those. We’re also now working to enable customers to be able to launch EBS backed instances and create, delete, attach and detach EBS volumes in the affected Availability Zone. Our current estimate is that this will take 3-4 hours until full access is restored. We will continue to keep everyone updated as we have additional information.

EC2 Apr 22, 6:27 PM PDT We’re continuing to work on restoring the remaining affected volumes. The work we’re doing to enable customers to be able to launch EBS backed instances and create, delete, attach and detach EBS volumes in the affected Availability Zone is taking considerably more time than we anticipated. The team is in the midst of troubleshooting a bottleneck in this process and we’ll report back when we have more information to share on the timing of this functionality being fully restored.

EC2 Apr 22, 9:11 PM PDT We wanted to give a more detailed update on the state of our recovery. At this point, we have recovered a large number of the stuck volumes and are in the process of recovering the remainder. We have added significant storage capacity to the cluster, and storage capacity is no longer a bottleneck to recovery. Some portion of these volumes have lost the connection to their instance, and are waiting to be connected before normal operations can resume. In order to re-establish this connection, we need to allow the instances in the affected Availability Zone to access the EC2 control plane service. There are a large number of control plane requests being generated by the system as we re-introduce instances and volumes. The load on our control plane is higher than we anticipated. We are re-introducing these instances slowly in order to moderate the load on the control plane and prevent it from becoming overloaded and affecting other functions. We are currently investigating several avenues to unblock this bottleneck and significantly increase the rate at which we can restore control plane access to volumes and instances – and move toward a full recovery. The team has been completely focused on restoring access to all customers, and as such has not yet been able to focus on performing a complete post mortem. Once our customers have been taken care of and are fully back up and running, we will post a detailed account of what happened, along with the corrective actions we are undertaking to ensure this doesn’t happen again. Once we have additional information on the progress that is being made, we will post additional updates.

RDS Apr 23, 12:00 AM PDT We are continuing to work on restoring access to the remaining affected RDS instances. We expect the restoration process to continue over the next several hours and we’ll update folks as we have new information.


EC2 Apr 23, 1:55 AM PDT We are continuing to work on unblocking the bottleneck that is limiting the speed with which we can re-establish connections between volumes and their instances. We will continue to keep everyone updated as we have additional information.

RDS Apr 23, 8:45 AM PDT We have made significant progress in resolving stuck IO issues and restoring access to RDS database instances and now have the vast majority of them back operational again. We continue to work on restoring access to the small number of remaining affected instances and we’ll update folks as we have new information.

EC2 Apr 23, 8:54 AM PDT We have made significant progress during the night in manually restoring the remaining stuck volumes, and are continuing to work through the remainder. Additionally we have removed some of the bottlenecks that were preventing us from allowing more instances to re-establish their connection with the stuck volumes, and the majority of those instances and volumes are now connected. We’ve encountered an additional issue that’s preventing the recovery of the remainder of the connections from being established, but are making progress. Once we solve for this bottleneck, we will work on restoring full access for customers to the control plane.

EC2 Apr 23, 11:54 AM PDT Quick update. We’ve tried a couple of ideas to remove the bottleneck in opening up the APIs, each time we’ve learned more but haven’t yet solved the problem. We are making progress, but much more slowly than we’d hoped. Right now we’re setting up more control plane components that should be capable of working through the backlog of attach/detach state changes for EBS volumes. These are coming online, and we’ve been seeing progress on the backlog, but it’s still too early to tell how much this will accelerate the process for us. For customers who are still waiting for restoration of the EBS control plane capability in the impacted AZ, or waiting for recovery of the remaining volumes, we understand that no information for hours at a time is difficult for you. We’ve been operating under the assumption that people prefer us to post only when we have new information. We think enough people have told us that they prefer to hear from us hourly (even if we don’t have meaningful new information) that we’re going to change our cadence and try to update hourly from here on out.

EC2 Apr 23, 12:46 PM PDT We have completed setting up the additional control plane components and we are seeing good scaling of the system. We are now processing through the backlog of state changes and customer requests at a very quick rate. Barring any setbacks, we anticipate getting through the remainder of the backlog in the next hour. We will be in a brief hold after that, assessing whether we can proceed with reactivating the APIs.

RDS Apr 23, 12:54 PM PDT As we mentioned in our last update at 8:45 AM, we now have the vast majority of affected RDS instances back operational again. Since that post, we have continued to work on restoring access to the small number of remaining affected instances. RDS uses EBS, and as such, our pace of recovery is dependent on EBS’s recovery. As mentioned in the most recent EC2 post, EBS’s recovery has gone a bit slower than anticipated in the last few hours. This has slowed down RDS recovery as well. We understand how significant this service interruption is for our affected customers and we are working feverishly to address the impact. We’ll update folks as we have new information. Additionally we have heard from customers that you prefer more frequent updates, even if there has been no meaningful progress. We have heard that feedback, and will try to post hourly updates here. Some of these updates will point to EC2’s updates (as they continue to recover the rest of EBS volumes), but we’ll post nonetheless.

HOW TO: Redirect all https traffic to http

There are LOTS of ways to do this.


This is what I do so that I can use the same code in any site’s .htaccess:

RewriteEngine On
RewriteBase /
Options +FollowSymlinks
RewriteCond %{HTTPS} =on
RewriteRule ^(.*)$ http://%{SERVER_NAME}/$1 [R=301,L]


I like this better than checking for port 443, because sometimes a load balancer handles the certificate and encryption and then sends the traffic to port 80. This approach still works in that situation. Of course you can adjust this code to redirect to www.%{SERVER_NAME} if that is your preference.
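One related note on load balancers: many (not all) of them pass the client’s original scheme in the de-facto X-Forwarded-Proto header. If yours does, you can test that header instead of %{HTTPS}. A sketch, with the header name being an assumption – check your balancer’s documentation:

```
RewriteEngine On
RewriteBase /
# X-Forwarded-Proto is set by many (not all) load balancers/proxies
RewriteCond %{HTTP:X-Forwarded-Proto} =https
RewriteRule ^(.*)$ http://%{SERVER_NAME}/$1 [R=301,L]
```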


Of course redirecting all http traffic to https is equally simple: add a ! (not) to the HTTPS check and adjust your target.

RewriteEngine On
RewriteBase /
Options +FollowSymlinks
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{SERVER_NAME}/$1 [R=301,L]