elasticsearch - what to do with unassigned shards

This is a common issue that arises from the default index settings, in particular when you try to replicate on a single node. To fix it, set the number of replicas to zero on the existing indices:

curl -XPUT http://localhost:9200/_settings -d '{ "number_of_replicas" :0 }'

Next, make sure shard allocation is enabled on the cluster (you can always change this setting back after all is said and done):

curl -XPUT http://localhost:9200/_cluster/settings -d '
{
    "transient" : {
        "cluster.routing.allocation.enable": "all"
    }
}'

Now sit back and watch the cluster clean up the unassigned replica shards. If you want this to apply to future indices as well, don't forget to add the following setting to the elasticsearch.yml file and bounce the cluster:

index.number_of_replicas: 0
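To watch the cleanup rather than guess, you can count how many shards are still unassigned. This is only a minimal sketch: count_unassigned is a made-up helper name, and it assumes the default _cat/shards column order, where the shard state is the fourth column.

```shell
# Hypothetical helper (not part of Elasticsearch): counts UNASSIGNED shards
# in `_cat/shards` output read from stdin.
# Assumes the default column order: index shard prirep state ...
count_unassigned() {
  awk '$4 == "UNASSIGNED" { n++ } END { print n + 0 }'
}

# Usage against a local node (assumes localhost:9200 as elsewhere in this answer):
#   curl -s 'localhost:9200/_cat/shards' | count_unassigned
```

Run it a few times; the number should drop to 0 once the replicas have been removed.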

There are many possible reasons why allocation won't occur:

  1. You are running different versions of Elasticsearch on different nodes.
  2. You only have one node in your cluster, but you have number of replicas set to something other than zero.
  3. You have insufficient disk space.
  4. You have shard allocation disabled.
  5. You have a firewall or SELinux enabled. With SELinux enabled but not configured properly, you will see shards stuck in INITIALIZING or RELOCATING forever.

As a general rule, you can troubleshoot things like this:

  1. Look at the nodes in your cluster: curl -s 'localhost:9200/_cat/nodes?v'. If you only have one node, you need to set number_of_replicas to 0. (See ES documentation or other answers).
  2. Look at the disk space available in your cluster: curl -s 'localhost:9200/_cat/allocation?v'
  3. Check cluster settings: curl 'http://localhost:9200/_cluster/settings?pretty' and look for cluster.routing settings
  4. Look at which shards are UNASSIGNED: curl -s 'localhost:9200/_cat/shards?v' | grep UNASS
  5. Try to force a shard to be assigned (careful: allow_primary can lose data if the target node does not hold a copy of that shard):

    curl -XPOST -d '{ "commands" : [ {
      "allocate" : {
           "index" : ".marvel-2014.05.21",
           "shard" : 0,
           "node" : "SOME_NODE_HERE",
           "allow_primary" : true
         }
      } ] }' 'http://localhost:9200/_cluster/reroute?pretty'
    
  6. Look at the response and see what it says. There will be a bunch of YES's that are ok, and then a NO. If there aren't any NO's, it's likely a firewall/SELinux problem.
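For the disk space check in step 2, something like the following can flag nodes that are getting full. Treat it as a sketch under assumptions: nodes_over_disk_threshold is a made-up helper name, the input is expected as "percent node-name" lines (for example from the h=disk.percent,node column selection of _cat/allocation), and the default of 85 is meant to mirror the usual low disk watermark, which may be configured differently on your cluster.

```shell
# Hypothetical helper (not part of Elasticsearch): prints nodes whose disk
# usage is at or above a threshold (first argument, default 85).
# Expects lines of "<disk.percent> <node name>" on stdin; lines whose first
# field is not a plain number (headers, empty values) are skipped.
nodes_over_disk_threshold() {
  awk -v limit="${1:-85}" '$1 ~ /^[0-9]+$/ && $1 + 0 >= limit + 0 {
    pct = $1
    $1 = ""            # drop the percent field, keep the (possibly multi-word) node name
    sub(/^ +/, "")     # trim the separator left behind
    print pct "% " $0
  }'
}

# Usage against a local node (assumes localhost:9200 as elsewhere in this answer):
#   curl -s 'localhost:9200/_cat/allocation?h=disk.percent,node' | nodes_over_disk_threshold
```

If this prints any nodes, free up disk space or add nodes before trying to force shards onto them.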