EC2 Auto-Scaling with Spot and On-Demand Instances?

Solution 1:

The approach discussed above would be a little messy and not very flexible. The more canonical approach is to create two Auto Scaling groups (ASGs), one for spot and one for on-demand, and register both with the same ELB (discussed here). This lets you control each group independently rather than swapping launch configurations (LCs) in a single ASG.
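A minimal sketch of the two-group setup, assuming the boto3 Auto Scaling client and its create_auto_scaling_group call. All group, launch configuration, ELB, and zone names are hypothetical placeholders, and the client is passed in rather than created so the request-building part can be checked without AWS credentials:

```python
def build_group_request(name, launch_config, elb_name, zones, min_size, max_size):
    """Build keyword arguments for create_auto_scaling_group."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,
        "MinSize": min_size,
        "MaxSize": max_size,
        "AvailabilityZones": list(zones),
        # Both groups list the same ELB, so it balances across all instances.
        "LoadBalancerNames": [elb_name],
    }


def create_groups(client, requests):
    """Send each request; client is assumed to be boto3.client("autoscaling")."""
    for request in requests:
        client.create_auto_scaling_group(**request)


# Hypothetical names, for illustration only.
ELB_NAME = "web-elb"
ZONES = ["us-east-1a", "us-east-1b"]

on_demand_group = build_group_request(
    "web-on-demand", "web-on-demand-lc", ELB_NAME, ZONES, min_size=2, max_size=4
)
spot_group = build_group_request(
    "web-spot", "web-spot-lc", ELB_NAME, ZONES, min_size=0, max_size=10
)
```

With real credentials this would be applied as create_groups(boto3.client("autoscaling"), [on_demand_group, spot_group]). Each group then gets its own size limits and scaling policies.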

Solution 2:

Unfortunately, this hybrid Auto Scaling approach doesn't seem to be available out of the box.

However, you might be able to work around this limitation as follows (untested, just a system design I've been mulling over for a while):

Potential Workaround

As outlined in Using Auto Scaling to Launch Spot Instances, the spot price bid is a parameter of the Launch Configuration in use. As you pointed out, there is no hybrid launch configuration available; each one must be either on-demand or spot, which means the use case requires two different launch configurations.
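To make the difference concrete, here is a small sketch of the two launch configuration requests, assuming the boto3 create_launch_configuration parameters. The names, AMI ID, instance type, and bid are hypothetical. The only difference between the two requests is the SpotPrice key:

```python
def build_launch_config(name, image_id, instance_type, spot_price=None):
    """Build keyword arguments for create_launch_configuration.

    Leaving spot_price unset yields an on-demand configuration;
    setting it yields a spot configuration with that maximum bid.
    """
    request = {
        "LaunchConfigurationName": name,
        "ImageId": image_id,
        "InstanceType": instance_type,
    }
    if spot_price is not None:
        # The API expects the bid as a string, e.g. "0.05".
        request["SpotPrice"] = str(spot_price)
    return request


# Hypothetical values, for illustration only.
on_demand_lc = build_launch_config("web-on-demand-lc", "ami-12345678", "m1.small")
spot_lc = build_launch_config("web-spot-lc", "ami-12345678", "m1.small", spot_price="0.05")
```

Either dictionary would be passed as client.create_launch_configuration(**request), with client assumed to be boto3.client("autoscaling").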

This doesn't seem to help right away, because you can attach only one launch configuration to an Auto Scaling group at a time, with the following (partially outdated) constraints (see Launch Configuration):

When you attach a new or updated launch configuration to your Auto Scaling group, any new instances will be launched using the new configuration parameters. Existing instances are not affected. When Auto Scaling needs to scale down, it first terminates instances that have an older launch configuration. [emphasis mine]

The emphasized parts are key, though. The former ("Existing instances are not affected") covers the requirement to keep the on-demand instances running after switching from the initial on-demand launch configuration to the additional spot launch configuration. The latter (instances with an older launch configuration being terminated first) is not necessarily the case anymore, thanks to the recently introduced Auto Scaling Termination Policies (for a change, there hasn't been the usual fanfare via an accompanying AWS blog post), documented in Instance Termination Policy for Your Auto Scaling Group:

Before Auto Scaling selects an instance to terminate, it first identifies the Availability Zone that has more instances than the other Availability Zones used by the group. If all Availability Zones have the same number of instances, it identifies a random Availability Zone. Within the identified Availability Zone, Auto Scaling uses the termination policy to select the instance for termination. [emphasis mine]
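The two-step selection described in this quote can be simulated locally. The following is only a toy model of the documented behavior, not AWS code. The instance records, their field names, and the "newest launch time" policy used in the example are illustrative assumptions:

```python
import random
from collections import Counter


def pick_zone(instances, rng=random):
    """Step 1: the zone with the most instances, or a random one on a tie."""
    counts = Counter(inst["zone"] for inst in instances)
    top = max(counts.values())
    candidates = sorted(zone for zone, n in counts.items() if n == top)
    return rng.choice(candidates)


def pick_instance(instances, policy, rng=random):
    """Step 2: apply the termination policy within the chosen zone only."""
    zone = pick_zone(instances, rng)
    in_zone = [inst for inst in instances if inst["zone"] == zone]
    return policy(in_zone)


def newest(instances):
    """Illustrative policy: the most recently launched instance."""
    return max(instances, key=lambda inst: inst["launched"])


# Hypothetical fleet; "launched" is a simple ordering stand-in for launch time.
fleet = [
    {"id": "i-ondemand-1", "zone": "us-east-1a", "launched": 1},
    {"id": "i-ondemand-2", "zone": "us-east-1b", "launched": 2},
    {"id": "i-spot-1", "zone": "us-east-1a", "launched": 3},
]

victim = pick_instance(fleet, newest)
print(victim["id"])  # i-spot-1
```

Because the zone is chosen first, the policy only ever sees the instances in that zone.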

As outlined in How Your Termination Policy Works, you can now specify NewestInstance if you want the most recently launched instance to be terminated; in this setup, that would be one of the more recently launched spot instances:

Auto Scaling uses the instance launch time to identify the instance that was launched last.
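A sketch of the corresponding change, assuming the boto3 update_auto_scaling_group call: it attaches the spot launch configuration to the existing group and sets the NewestInstance termination policy in one request. The group and launch configuration names are hypothetical, and like the rest of this workaround it is untested against a real account:

```python
def build_switch_to_spot(group_name, spot_launch_config):
    """Build keyword arguments for update_auto_scaling_group."""
    return {
        "AutoScalingGroupName": group_name,
        # New instances launch from the spot configuration from now on;
        # the existing on-demand instances are not affected.
        "LaunchConfigurationName": spot_launch_config,
        # Scale-in removes the most recently launched instance first.
        "TerminationPolicies": ["NewestInstance"],
    }


def apply_update(client, request):
    """client is assumed to be boto3.client("autoscaling")."""
    client.update_auto_scaling_group(**request)


# Hypothetical names, for illustration only.
switch_request = build_switch_to_spot("web-asg", "web-spot-lc")
```

TerminationPolicies takes an ordered list, so further policies could be appended after NewestInstance.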

Obviously there might be a bit more to this (e.g. you can specify any one of the policies as a standalone policy, or list multiple policies in an ordered list), but this approach should ensure that the load of all instances is factored into the auto-scaling measurements and triggers. One caveat remains, though:

Caveat

If Auto Scaling terminates one of the on-demand instances for any other reason (e.g. because the instance itself has become unhealthy), it wouldn't be replaced by an on-demand instance automatically. So you'd need to monitor and account for this event separately, e.g. by temporarily activating the on-demand launch configuration again.
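One way to monitor for this, sketched under assumptions: count the in-service instances per launch configuration in a describe_auto_scaling_groups response (the response shape is that of the boto3 call) and flag when the on-demand count drops below a baseline. The names and the baseline of 2 are hypothetical:

```python
def count_by_launch_config(response, group_name):
    """Count in-service instances per launch configuration name."""
    counts = {}
    for group in response["AutoScalingGroups"]:
        if group["AutoScalingGroupName"] != group_name:
            continue
        for inst in group["Instances"]:
            if inst.get("LifecycleState") != "InService":
                continue
            name = inst.get("LaunchConfigurationName")
            counts[name] = counts.get(name, 0) + 1
    return counts


def needs_on_demand(response, group_name, on_demand_lc, minimum):
    """True if fewer than `minimum` on-demand instances are in service."""
    counts = count_by_launch_config(response, group_name)
    return counts.get(on_demand_lc, 0) < minimum


# Hypothetical response fragment, trimmed to the fields used above.
sample = {
    "AutoScalingGroups": [{
        "AutoScalingGroupName": "web-asg",
        "Instances": [
            {"InstanceId": "i-1", "LifecycleState": "InService",
             "LaunchConfigurationName": "web-on-demand-lc"},
            {"InstanceId": "i-2", "LifecycleState": "InService",
             "LaunchConfigurationName": "web-spot-lc"},
        ],
    }]
}

print(needs_on_demand(sample, "web-asg", "web-on-demand-lc", minimum=2))  # True
```

When the check returns True, a scheduled job could reattach the on-demand launch configuration until the baseline is restored, then switch back to the spot one.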

Good luck!