Personal Cloud Storage with AWS EFS and Site to Site VPN

Let's extend our local network to the cloud by setting up a site-to-site VPN with Amazon Web Services (AWS), and make file sharing easy between remote and on-premises infrastructure with Elastic File System (EFS). In addition to an AWS account, we need an IPsec-capable router; I'm using a Netgate 2100 with pfSense+. We also need a workaround if our home network lacks a static IP address.

All cloud services will be managed with Terraform: https://github.com/mwgolden/site-to-site-vpn

The initial design is simple and quite easy to set up; see the VPC and VPN modules in the source code. We need the following:

  1. A VPC with a customer gateway, virtual private gateway, and VPN connection defined.
  2. An EFS resource, an updated VPC route table, and a security group rule that opens TCP port 2049 for NFS.

Once the connection is established, we can mount the EFS file system like so:

sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport mount-target-ip:/ ~/efs-mount-point
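If the mount hangs, it helps to confirm that the NFS port on the mount target is reachable through the tunnel before blaming the mount options. The following is a minimal sketch using only the Python standard library; the function name is my own and is not part of the repository.

```python
import socket


def nfs_port_reachable(host: str, port: int = 2049, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the full TCP handshake, so success means
        # routing and any port filtering in the path allow the traffic.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

Calling nfs_port_reachable with the mount target's private IP address from a machine on the home network should return True once the VPN and the port 2049 rule are in place. A False result points at the tunnel, routing, or the security group rather than at NFS itself.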

Here is where we run into some friction: Amazon requires one of the following for a site-to-site VPN:

  1. Static IP address for the customer gateway
  2. AWS Private Certificate Authority - In lieu of a static IP address, we can use AWS Private CA to issue a certificate to our gateway device. This is cost-prohibitive, as AWS charges $400/mo for this service.

A third option: the Netgate router has a built-in dynamic DNS service with a custom configuration that lets us send an updated IP address in a GET request to an API endpoint. We can define our own serverless dynamic IP service.

The Terraform code creates a dynamic IP service that caches the current IP address in a file on S3. It creates an API Gateway endpoint that invokes a Lambda function. The Lambda compares the cached IP address with the IP in the URL query string. If the IP address is different, the Lambda publishes an event to EventBridge.
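The compare-and-publish step can be sketched roughly as follows. This is not the repository's handler: the query parameter name "ip", the field names in the published detail, and the three injected callables are all assumptions made for illustration. In a deployed Lambda the callables would presumably wrap S3 reads and writes and an EventBridge publish; they are passed in here so the logic runs without AWS credentials.

```python
import ipaddress
import json
from typing import Callable, Optional


def handle_ip_update(
    event: dict,
    read_cached_ip: Callable[[], Optional[str]],
    write_cached_ip: Callable[[str], None],
    publish_change: Callable[[dict], None],
) -> dict:
    """Compare the reported IP with the cached one and publish only on change."""
    # API Gateway proxy events carry the query string here (None when absent).
    params = event.get("queryStringParameters") or {}
    reported = params.get("ip")  # assumed parameter name
    if not reported:
        return {"statusCode": 400, "body": json.dumps({"error": "missing ip"})}
    try:
        # Reject anything that is not a valid IPv4 or IPv6 address.
        new_ip = str(ipaddress.ip_address(reported))
    except ValueError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid ip"})}

    old_ip = read_cached_ip()
    if new_ip == old_ip:
        return {"statusCode": 200, "body": json.dumps({"changed": False, "ip": new_ip})}

    # Publish before updating the cache, so a failed publish is retried on
    # the router's next update request instead of being silently lost.
    publish_change({"previous_ip": old_ip, "current_ip": new_ip})  # assumed field names
    write_cached_ip(new_ip)
    return {"statusCode": 200, "body": json.dumps({"changed": True, "ip": new_ip})}
```

Validating the address before touching the cache keeps a malformed request from triggering a VPN replacement, which matters because the endpoint is reachable from the public internet.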

There are two EventBridge targets: Amazon CloudWatch Logs and an ECS task. The ECS task runs a Terraform command to replace the VPN connection with the updated IP address. Unfortunately, we can't simply update the customer gateway address. The gateway must be replaced, and the entire VPN connection must be replaced with it.

terraform apply -replace="module.vpn.aws_customer_gateway.customer_gateway" -var-file="variables.tfvars" -auto-approve

Plan: 1 to add, 1 to change, 1 to destroy.
module.vpn.aws_customer_gateway.customer_gateway: Destroying... [id=cgw-0c945335d165fdcf5]
╷
│ Error: deleting EC2 Customer Gateway : operation error EC2: DeleteCustomerGateway, https response error StatusCode: 400, RequestID: 2d42d0d3-34e0-410b-be2d-5836e360e4b0, api error IncorrectState: The customer gateway is in use.
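If the "in use" error is what blocks the replacement, one thing to try is asking Terraform to replace the VPN connection in the same run, since -replace can be given more than once. Here is a hedged sketch of how the ECS task might assemble that command. The VPN connection's resource address below is hypothetical (I have not confirmed what the module calls it), and the helper name is my own.

```python
import shlex
from typing import List, Sequence


def build_replace_command(addresses: Sequence[str], var_file: str) -> List[str]:
    """Build a `terraform apply` argv that forces replacement of each address."""
    if not addresses:
        raise ValueError("at least one resource address is required")
    cmd = ["terraform", "apply"]
    # One -replace flag per resource that must be destroyed and recreated.
    cmd += [f"-replace={address}" for address in addresses]
    cmd += [f"-var-file={var_file}", "-auto-approve"]
    return cmd


command = build_replace_command(
    [
        "module.vpn.aws_customer_gateway.customer_gateway",
        # Hypothetical address: substitute the VPN connection's real one.
        "module.vpn.aws_vpn_connection.vpn_connection",
    ],
    "variables.tfvars",
)
# Print a shell-safe rendering for the task logs; the argv list itself can
# be passed to subprocess.run without going through a shell.
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems if a resource address ever contains characters the shell would interpret, such as the brackets in an indexed instance.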

Replacing the VPN connection requires updating the pre-shared key and remote gateway on the on-premises router. I'm sure this could be automated if needed.