How to: Terraform Locking State in S3

Terraform is a tool developed by HashiCorp that allows you to build your infrastructure using code. Terraform makes spinning up infrastructure less painful and making changes less scary. By describing infrastructure as code, spinning up a new server turns into submitting a pull request, and rolling back to a previous state of infrastructure becomes as easy as reverting a commit. Terraform is not limited to AWS: it can provision a whole suite of AWS products, and it also integrates with a growing list of providers including DigitalOcean, OpenStack and more.

Intro

Using Terraform as a systems developer is a good start for remodelling your infrastructure as code. However, to really scale, multiple people need to be able to work on your Terraform stacks. This problem is solved by Terraform’s remote state. Remote state allows you to store the state file for a stack in a third-party storage provider so it can be shared across developers. Without remote state, each developer would have their own statefile. As you can imagine, this gets really messy as people start to clobber each other’s changes.

Terraform provides users with several options when it comes to remote state backends, including S3, Consul and HTTP.
S3 is a particularly interesting backend to use since you can version the contents of buckets. Conceivably, you could then also version-control the states of your infrastructure.

What’s in the state file?

The state file contains information about what real resources exist for each object defined in the terraform config files. For example, if you have a DNS zone resource created in your terraform config, then the state file contains info about the actual resource that was created on AWS.

Here is an example of creating a DNS zone with Terraform along with its state file:


# example.tf
# create a DNS zone called example.com
resource "aws_route53_zone" "example_dns_zone" {
  name = "example.com"
}

# terraform.tfstate
# in the state file, the DNS zone ID along with its name is stored
"aws_route53_zone.example_dns_zone": {
    "type": "aws_route53_zone",
    "primary": {
       "id": "Z2D3OUXZHH4NUA",
       "attributes": {       
          "name": "example.com"
        }
     }
},
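
As a small illustration of how values recorded in the state can be surfaced, the sketch below declares an output for the zone ID. The output name example_zone_id is hypothetical and not part of the original config. zone_id is an attribute the AWS provider exports for aws_route53_zone, and the reference syntax assumes Terraform 0.12 or later.

```hcl
# Hypothetical output (not in the original example): after `terraform apply`,
# Terraform reads the zone ID from the state and prints it.
output "example_zone_id" {
  value = aws_route53_zone.example_dns_zone.zone_id
}
```

After an apply, terraform output example_zone_id should print the same ID that appears under "id" in the state file.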

Store State Remotely in S3

If you are working on a team, then it’s best to store the Terraform state file remotely so that many people can access it. To set up Terraform to store state remotely you need two things: an S3 bucket to store the state file in and a Terraform S3 backend configuration.

You can create an S3 bucket in a Terraform config like so:

# example.tf
provider "aws" {
  region = "us-west-2"
}
# terraform state file setup
# create an S3 bucket to store the state file in
resource "aws_s3_bucket" "terraform-state-storage-s3" {
    bucket = "terraform-remote-state-storage-s3"
 
    versioning {
      enabled = true
    }
 
    lifecycle {
      prevent_destroy = true
    }
 
    tags = {
      Name = "S3 Remote Terraform State Store"
    }
}
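
On version 4 and later of the AWS provider, the inline versioning block on aws_s3_bucket is deprecated in favour of a separate resource. A minimal sketch of the equivalent, assuming that provider version and Terraform 0.12+ reference syntax (the resource label state_versioning is made up for illustration):

```hcl
# Assumes AWS provider >= 4.0; stands in for the inline `versioning` block
# on the bucket resource defined above.
resource "aws_s3_bucket_versioning" "state_versioning" {
  bucket = aws_s3_bucket.terraform-state-storage-s3.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

With versioning on, each write of the state file keeps the previous object version, so an earlier state can be recovered from the bucket if needed.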

Then configure the S3 backend like so:

# terraform.tf
terraform {
  backend "s3" {
    encrypt = true
    bucket  = "terraform-remote-state-storage-s3"
    region  = "us-west-2"
    key     = "path/to/state/file"
  }
}

What is locking and why do we need it?

If the state file is stored remotely so that many people can access it, then you risk multiple people attempting to make changes to the same file at the exact same time. So we need a mechanism that will “lock” the state if it’s currently in use by another user. We can accomplish this by creating a DynamoDB table for Terraform to use.

Create the DynamoDB table like this:

# example.tf
# create a dynamodb table for locking the state file
resource "aws_dynamodb_table" "dynamodb-terraform-state-lock" {
  name = "terraform-state-lock-dynamo"
  hash_key = "LockID"
  read_capacity = 20
  write_capacity = 20
 
  attribute {
    name = "LockID"
    type = "S"
  }
 
  tags = {
    Name = "DynamoDB Terraform State Lock Table"
  }
}

You will then need to modify the Terraform S3 backend configuration to reference the DynamoDB table:

# terraform.tf
terraform {
  backend "s3" {
    encrypt        = true
    bucket         = "terraform-remote-state-storage-s3"
    dynamodb_table = "terraform-state-lock-dynamo"
    region         = "us-west-2"
    key            = "path/to/state/file"
  }
}
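
The lock table shown earlier provisions 20 read and 20 write capacity units. Lock traffic is typically light, so one possible alternative is on-demand billing. The sketch below is a variant under stated assumptions rather than the original author's setup: it needs an AWS provider version that supports billing_mode, and it uses the tags = { ... } form expected by Terraform 0.12 and later.

```hcl
# Variant of the lock table using on-demand capacity (an assumption, not the
# original configuration). The S3 backend expects the hash key to be a
# string attribute named "LockID".
resource "aws_dynamodb_table" "dynamodb-terraform-state-lock" {
  name         = "terraform-state-lock-dynamo"
  hash_key     = "LockID"
  billing_mode = "PAY_PER_REQUEST" # no read_capacity / write_capacity needed

  attribute {
    name = "LockID"
    type = "S"
  }

  tags = {
    Name = "DynamoDB Terraform State Lock Table"
  }
}
```

The table name is unchanged, so the dynamodb_table setting in the backend block stays the same.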

Final step to push the state file:

Once you’ve created the S3 bucket and DynamoDB table, along with the S3 backend configuration referencing them, run terraform init to initialise the backend. After that you can run your Terraform configs as normal with the terraform plan and terraform apply commands, and the state file will show up in the S3 bucket. If you then inspect .terraform/terraform.tfstate, you will see that it now contains the location of the state file rather than the state itself.

cat .terraform/terraform.tfstate
{
    "version": 3,
    "backend": {
        "type": "s3",
        "config": {
            "bucket": "terraform-remote-state-storage-s3",
            "dynamodb_table": "terraform-state-lock-dynamo",
            "encrypt": true,
            "key": "example/terraform.tfstate",
            "region": "us-west-2"
        }
    }
}

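If you already have a local statefile you will probably want to push it up to S3. With a backend block like the one above (Terraform 0.9 and later), terraform init handles this: it detects the existing local state and offers to copy it to the new backend. The older terraform remote push command was removed in 0.9. So, run:

terraform init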

Now whenever you run terraform plan or terraform apply, the remote state is read from S3 and the lock keeps you from clobbering another developer’s changes while your command runs. Finally, when you apply a change, the resulting state is uploaded to the remote backend.

To inspect the current remote state you can simply run the following, which replaces the pre-0.9 terraform remote pull command:

terraform state pull

[Figure: Remote state diagram]

Learnings

  • Keep your remote states small

Don’t keep the state of your infrastructure in one giant statefile. This will slow down development time since only one person should be changing a statefile at a time. It also creates a tight coupling between all the parts of your infrastructure. At Hootsuite we are separating our state files by environment, service and project. So, each project within a service will have a statefile per environment. Whenever changes are made to a project’s infrastructure we can guarantee that only that project in a given environment will be modified. For projects that depend on other stacks, use outputs and include the statefile of the dependencies. There is some more interesting discussion and opinions linked below.
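
The "use outputs and include the statefile of the dependencies" advice can be sketched with the terraform_remote_state data source. Everything named here beyond the bucket and region is hypothetical: the network stack, its state key, its vpc_id output, and the security group. The syntax assumes Terraform 0.12 or later.

```hcl
# Read the outputs of a separate (hypothetical) "network" stack from its
# own statefile in the shared bucket.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "terraform-remote-state-storage-s3"
    key    = "network/production/terraform.tfstate" # hypothetical key
    region = "us-west-2"
  }
}

# Use one of those outputs in this stack. The network stack would need to
# declare `output "vpc_id"` for this reference to resolve.
resource "aws_security_group" "app" {
  name   = "app-sg" # hypothetical name
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

This keeps the two stacks in separate statefiles: the consuming stack only reads the other stack's outputs and never modifies its state.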

  • Use remote states as opposed to local states

Local states that are stored in git are OK if you are a small (fewer than 3 people) team. But this doesn’t scale as you grow the number of contributors. Using remote states is a best practice and ensures that there is one source of truth for the state of your infrastructure.

  • Lock remote states

Don’t have multiple people working on the same stack at the same time. This is a recipe for disaster. When only one person is working on a stack, the state is guaranteed not to change between planning and applying a change. When a second person is also working on the stack, there is no such guarantee. The result is a mangled statefile and two sets of changes that don’t work. This can be avoided with processes as simple as letting your team know you are making a change and that they should not touch the stack. HashiCorp takes a more rigorous approach to this problem with Atlas, which allows you to version control and lock states.

  • Have a process and use tools

Just like you probably have an organization-wide standard process for building applications or modifying infrastructure, you should have a standardized process for Terraform. When starting with a small part of the infrastructure and a small team, it is probably OK for everyone to do their own config and run Terraform locally. However, as the team and the infrastructure managed by Terraform grow, managing stacks becomes more difficult. Tools that let you set up remote states, plan, and then apply that plan with one command, without needing to know all the details of the other Terraform stacks, make Terraform more accessible to other teams and help relieve the worry that something will break.

Ref: https://medium.com/@jessgreb01/how-to-terraform-locking-state-in-s3-2dc9a5665cb6