Using GitHub Actions to automatically build and deploy an Elixir + Phoenix web app on AWS EC2 running Amazon Linux 2023

At a glance #

Below is a GitHub Actions workflow with two jobs that builds an Elixir + Phoenix release binary using Docker and deploys the binary onto an AWS EC2 instance running Amazon Linux 2023. This approach worked fairly well for my hackathon project, but be warned that single-instance deployments like this are neither horizontally scalable nor highly available.

Why Elixir? #

I chose to build Backyard Birds in Elixir because hackathons are a good opportunity to explore new technology. I first learned about Elixir back in 2019 and generally liked the functional syntax and the “let it crash” ethos, which centralizes error and retry logic inside a supervisor process.
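In practice that looks something like the following supervision tree (module names here are illustrative, not from my actual app):

defmodule Birds.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Birds.FeederFetcher is assumed to be a GenServer. If it crashes,
      # the supervisor restarts it; callers don't carry their own retry logic.
      Birds.FeederFetcher
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: Birds.Supervisor)
  end
end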

Given that six years had passed since I first tried Elixir, I assumed it would be smooth sailing. Initially, setting up a Phoenix web app and running it locally was very easy. Installing dependencies using the Elixir build tool, mix, was just as simple as using npm.
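For example, pulling in a new package is a one-line change to mix.exs followed by mix deps.get (the versions below are illustrative):

# mix.exs
defp deps do
  [
    {:phoenix, "~> 1.7"},
    {:jason, "~> 1.4"},
    # aws-elixir, the community AWS SDK discussed below
    {:aws, "~> 1.0"}
  ]
end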

The first obstacle I faced was the lack of official Hex packages. Most notably, there was no official AWS SDK. I chose to use aws-elixir, which worked well enough, but it was poorly documented, didn’t feel very idiomatic, and annoyingly didn’t have separate packages for each service, so I had to compile hundreds of modules.

Second, coming from TypeScript, I missed static types. jason made it easy to parse JSON responses, but I just got back {:ok, %{}} tuples containing a Map that I had to manually extract values from. Folks on the Elixir Discord were incredibly helpful and suggested I consider making structs to represent API responses. This was a bit cumbersome but gave me more structural safety when handling various JSON payloads.

defmodule Yoto.Models.DisplayIcon do
  @moduledoc "Represents a display icon uploaded for a Card."

  @derive {Jason.Encoder, only: [:mediaId, :userId, :displayIconId, :url, :new]}
  defstruct [
    :mediaId,
    :userId,
    :displayIconId,
    :url,
    :new
  ]
end
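A minimal sketch of how one of these structs gets populated from a decoded response (the response_body variable and the field mapping are simplified for illustration):

# Decode the JSON body and copy the fields we care about into the struct.
# Keys that are absent from the payload simply become nil.
case Jason.decode(response_body) do
  {:ok, map} ->
    {:ok,
     %Yoto.Models.DisplayIcon{
       mediaId: map["mediaId"],
       userId: map["userId"],
       displayIconId: map["displayIconId"],
       url: map["url"],
       new: map["new"]
     }}

  {:error, reason} ->
    {:error, reason}
end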

I had mixed luck using LLMs like ChatGPT and agentic coding tools like Claude Code with Elixir. As a beginner, leaning on them robbed me of the (sometimes painful) experience that comes with learning a new language. Worse, they often gaslit me by making up non-existent functions in the Elixir standard library or suggesting the language had features that it does not. I often got syntactically invalid “Elixir” from ChatGPT or Claude.

Deploying to AWS #

After much struggle, I finally had an Elixir app running locally that I was happy with. This is when things got surprisingly difficult. I develop on an M-series MacBook (arm64), and I wanted to build and deploy on AWS since I already had an account and a domain name in Route53. Yet there’s little prior art showing how to deploy Elixir on AWS EC2, and even less showing how to do automated, continuous deployments using GitHub Actions.

My preference was to automatically build my Elixir app for an AWS EC2 instance with a Graviton3 (arm64) CPU running Amazon Linux 2023, because this was the cheapest, and likely most secure, default configuration. With great help from the folks on Discord, I tried:

⛔ GitHub Action + arm64 = Dead End

Building on an arm64 runner (in Public Beta) using ubuntu-24.04-arm, in the hopes of deploying to an EC2 instance with a Graviton CPU. My Actions sat idle for a day before hitting a timeout; arm64 runners were never available to pick up my Action.

⛔ GitHub Action + Ubuntu (default) = Dead End

GitHub generously provides free compute for Actions, which defaults to Ubuntu 22.04 running on an x86_64 CPU. Building directly on that runner and then running on Amazon Linux 2023 resulted in a discrepancy between the glibc version used to compile the release and the one available on the server running it.

🏆 Build and deploy on the same environment

The best solution was to ensure the build and runtime environments matched as closely as possible. To do that, I used Docker to build from an amazonlinux:2023 base image, then downloaded and installed Erlang/OTP and Elixir. Amazon Linux is based on Fedora, but Erlang and Elixir aren’t available from the dnf package manager’s default repositories, so OTP had to be built from source and Elixir installed from a precompiled release.

The resulting binary is then copied onto my EC2 instance using secure copy (SCP), authenticated with a private key stored in GitHub Secrets. A script then creates a systemd unit file and starts the service.

This approach actually worked quite well for a single, small EC2 instance. There is presumably some downtime during deployments while the running service is stopped and the new one is started, but it was too short to actually perceive, and more than acceptable for a hackathon project.

tmpfs Missteps #

I picked the smallest EC2 instance size because I didn’t expect significant traffic and viewed my app as primarily I/O bound. However, I forgot about a single step in one workflow where ImageMagick is used to scale down and pixelate a 1024x1024 image from ChatGPT into a 16x16 pixel art icon for Yoto. All data in Elixir is immutable and passed by value. Probably out of excess caution, to avoid keeping the PNG in memory throughout the workflow, I wrote it to a temporary file and passed the file name around instead.

A problem arose when temporary files appeared to not get deleted and the partition ran out of disk space. Calls to the mogrify library (which in turn just shells out to ImageMagick via System.cmd) failed due to insufficient space. After some research, I realized the default tmpfs size is half of physical RAM, and the instance had no swap configured. That didn’t leave a lot of space on my t3.nano instance with just 0.5GB of memory.

I ended up using PrivateTmp, a systemd sandboxing setting, to give the service its own namespace for /tmp and /var/tmp. This seems like a best practice anyway, and it ensured no other process was contributing to the issue. I also used the briefly library to ensure temporary files were deleted shortly after each workflow execution completed.
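The application-side fix looked roughly like this (variable names are illustrative; check the briefly and mogrify option names against the versions you use):

# Write the PNG bytes to a temp file tracked by briefly; the file is
# cleaned up automatically when the owning process exits.
{:ok, path} = Briefly.create(extname: ".png")
File.write!(path, png_binary)

# Shell out to ImageMagick via mogrify to shrink the image to 16x16.
path
|> Mogrify.open()
|> Mogrify.resize("16x16")
|> Mogrify.save(in_place: true)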

CloudFront VPC Origin doesn’t play nice with WebSockets #

Perhaps the biggest dead end I faced was with AWS CloudFront’s VPC Origin configuration. This let me set my EC2 instance, inside a private subnet in a Virtual Private Cloud (VPC), as an origin server for a CloudFront distribution. It seemed like a great solution: I could use ACM to auto-renew my SSL certificates and Route53 to manage the DNS entries, and I had a Content Delivery Network (CDN) in front of my small EC2 instance to reduce load via caching.

However, I quickly realized that my app didn’t work as expected because the Phoenix client couldn’t establish a WebSocket connection to my server. I enabled CloudFront logging and saw that the HTTP 502 Bad Gateway errors corresponded to OriginDnsError log entries in CloudWatch. I found several similar issues, and after some consultation on the AWS Discord, realized this was a fundamental limitation of CloudFront that I could not overcome. In the end, I swapped WebSockets for HTTP polling during wait states while a Yoto card was being generated.

Lesson learned: CloudFront does not support WebSockets for VPC Origins pointing directly to an EC2 instance. It may work if you’re using a load balancer as the origin.
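The polling replacement was nothing fancy: the client periodically hits a status endpoint over plain HTTP instead of waiting for a push. A minimal sketch (the module, route, and CardGenerator are illustrative names, not my actual code):

defmodule BirdsWeb.CardStatusController do
  use BirdsWeb, :controller

  # The client polls GET /api/cards/:id/status every few seconds while a
  # card is being generated; plain HTTP requests pass through the VPC
  # origin without issue.
  def show(conn, %{"id" => id}) do
    case CardGenerator.status(id) do
      :done -> json(conn, %{status: "done"})
      :pending -> json(conn, %{status: "pending"})
    end
  end
end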

GitHub Action to Build + Deploy an Elixir service on AWS EC2 running AL2023 #

This GitHub Actions workflow has two jobs. The first uses Docker to build a release binary of an Elixir + Phoenix web app on Amazon Linux 2023 (AL2023); GitHub Container Registry (GHCR) is used to cache intermediate build stages to reduce subsequent build times. The second job uses SCP to copy a tarball of the binary onto an EC2 host, then uses Secure Shell (SSH) to install a systemd service and start the Phoenix web app.

Secrets #

The following values need to be added to your GitHub repository under Settings > Secrets and variables > Actions.

Secret Name        Description                                   Example Values
EC2_HOST           The SSH host (with port)                      172.32.18.129 or ec2-172-32-18-129.us-east-2.compute.amazonaws.com
EC2_SSH_KEY        The private SSH key, in .pem file format      Contents of your .pem key file
EC2_USER           The SSH user                                  ec2-user (default for Amazon Linux)
ENV_VARS           Environment variables in .env format          DATABASE_URL=postgres://... API_KEY=abc123
RELEASE_COOKIE     The release cookie for distributed Erlang     Value from releases/COOKIE or generate with mix release.init
SECRET_KEY_BASE    Phoenix secret key base                       Run mix phx.gen.secret to generate

name: Build & Deploy Phoenix (linux/x86) to EC2

on:
  push:
    branches: [ "main" ]
  workflow_dispatch:

env:
  MIX_ENV: prod
  ELIXIR_VERSION: "1.18.4"        # adjust as needed
  OTP_VERSION: "27.3.4.2"         # match what you have in prod
  BASE_OTP_VERSION: "27"          # match what you have in prod
  APP_NAME: "myapp"               # lower_snake_case as created by mix new
  PHX_PORT: "4000"
  PHX_HOST: "example.com"

jobs:
  build:
    permissions:
      contents: read
      packages: write

    name: Build linux/x86 release
    runs-on: ubuntu-latest
    environment: prod

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        with:
          buildkitd-flags: --debug
          install: true
          driver-opts: |
            image=moby/buildkit:buildx-stable-1
            network=host

      - name: Create optimized Dockerfile
        run: |
          cat <<EOF > Dockerfile.build
          FROM amazonlinux:2023 AS base

          # Install build tools and dependencies (cached layer)
          RUN dnf update -y && \
              dnf install -y \
              git tar gzip gcc gcc-c++ make wget \
              openssl-devel ncurses-devel libxslt libxml2 \
              openssl-static ncurses-static zlib-static \
              unzip libtool \
              --skip-broken && \
              dnf clean all

          # Download stage - separate for early failure and caching
          FROM base AS downloader
          ARG OTP_VERSION
          ARG ELIXIR_VERSION
          ARG BASE_OTP_VERSION
          
          # Download files in parallel where possible
          RUN echo "Downloading OTP \${OTP_VERSION} and Elixir \${ELIXIR_VERSION}..." && \
              wget -q --spider "https://github.com/erlang/otp/releases/download/OTP-\${OTP_VERSION}/otp_src_\${OTP_VERSION}.tar.gz" && \
              wget -q --spider "https://github.com/elixir-lang/elixir/releases/download/v\${ELIXIR_VERSION}/elixir-otp-\${BASE_OTP_VERSION}.zip" && \
              wget -q "https://github.com/erlang/otp/releases/download/OTP-\${OTP_VERSION}/otp_src_\${OTP_VERSION}.tar.gz" -O /tmp/otp_src.tar.gz & \
              wget -q "https://github.com/elixir-lang/elixir/releases/download/v\${ELIXIR_VERSION}/elixir-otp-\${BASE_OTP_VERSION}.zip" -O /tmp/elixir-precompiled.zip & \
              wait

          # OTP build stage - heavily optimized
          FROM downloader AS otp-builder
          COPY --from=downloader /tmp/otp_src.tar.gz /tmp/
          
          RUN echo "Building OTP with optimizations..." && \
              tar -xzf /tmp/otp_src.tar.gz -C /tmp && \
              cd /tmp/otp_src_* && \
              ./configure \
                --disable-jit \
                --disable-hipe \
                --disable-sctp \
                --disable-silent-rules \
                --enable-shared-zlib \
                --enable-threads \
                --with-ssl \
                --without-javac \
                --without-odbc \
                --without-wx && \
              make -j\$(nproc) && \
              make install && \
              rm -rf /tmp/*

          # Final runtime stage
          FROM base AS runtime
          
          # Copy OTP from builder stage
          COPY --from=otp-builder /usr/local /usr/local
          
          # Install Elixir from precompiled binary
          COPY --from=downloader /tmp/elixir-precompiled.zip /tmp/
          RUN unzip -q /tmp/elixir-precompiled.zip -d /usr/local/elixir && \
              ln -sf /usr/local/elixir/bin/elixir /usr/local/bin/elixir && \
              ln -sf /usr/local/elixir/bin/mix /usr/local/bin/mix && \
              ln -sf /usr/local/elixir/bin/iex /usr/local/bin/iex && \
              rm -f /tmp/elixir-precompiled.zip

          # Verify installations and pre-install common tools
          RUN erl -version && elixir --version && \
              mix local.hex --force && \
              mix local.rebar --force

          # Set up working directory
          WORKDIR /app
          EOF

      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
          logout: false

      - name: Build optimized container with caching
        run: |
          REPO_LC=$(echo '${{ github.repository }}' | tr '[:upper:]' '[:lower:]') && \
          docker buildx build \
            --build-arg OTP_VERSION=${{ env.OTP_VERSION }} \
            --build-arg ELIXIR_VERSION=${{ env.ELIXIR_VERSION }} \
            --build-arg BASE_OTP_VERSION=${{ env.BASE_OTP_VERSION }} \
            --cache-from="type=registry,ref=ghcr.io/${REPO_LC}/builder-cache:latest" \
            --cache-to="type=registry,ref=ghcr.io/${REPO_LC}/builder-cache:latest,mode=max" \
            --load \
            -t amazon-linux-builder \
            -f Dockerfile.build .

      - name: Create artifact directory
        run: mkdir -p release_out

      - name: Build Phoenix Release with optimizations
        run: |
          docker run --rm \
            -v ${{ github.workspace }}:/app \
            -v $PWD/release_out:/app/release_out \
            -e MIX_ENV=${{ env.MIX_ENV }} \
            -e SECRET_KEY_BASE=${{ secrets.SECRET_KEY_BASE }} \
            -e RELEASE_COOKIE=${{ secrets.RELEASE_COOKIE }} \
            -e ELIXIR_ERL_OPTIONS="+fnu" \
            --workdir /app \
            amazon-linux-builder \
            /bin/bash -c "
              set -euo pipefail
              
              # Ensure release_out directory exists inside container
              mkdir -p /app/release_out
              
              echo 'Getting dependencies...'
              mix deps.get --only \${MIX_ENV}
              
              echo 'Compiling with optimizations...'
              # Compile with multiple jobs and optimizations
              MIX_ENV=\${MIX_ENV} mix compile --force --all-warnings

              echo 'Building release...'
              MIX_ENV=\${MIX_ENV} mix release --overwrite
              
              echo 'Packaging artifact...'
              tar -C _build/\${MIX_ENV}/rel/${{ env.APP_NAME }} -czf /app/release_out/${{ env.APP_NAME }}.tar.gz .
              
              echo 'Build complete!'
            "

      - name: Upload artifact
        uses: actions/upload-artifact@v4
        with:
          name: ${{ env.APP_NAME }}-x86-release
          path: release_out/${{ env.APP_NAME }}.tar.gz
          retention-days: 7

    outputs:
      app_name: ${{ env.APP_NAME }}

  deploy:
    name: Deploy to EC2
    runs-on: ubuntu-latest
    needs: build
    environment: prod
    if: github.ref == 'refs/heads/main'

    env:
      APP_NAME: "${{ needs.build.outputs.app_name }}"
      REMOTE_DIR: "/opt/${{ needs.build.outputs.app_name }}"
      SERVICE_NAME: "${{ needs.build.outputs.app_name }}"
      ENV_FILE: "/etc/${{ needs.build.outputs.app_name }}.env"

    steps:
      - name: Download artifact
        uses: actions/download-artifact@v4
        with:
          name: ${{ env.APP_NAME }}-x86-release
          path: .

      - name: Copy release to EC2
        uses: appleboy/scp-action@v0.1.7
        with:
          host: ${{ secrets.EC2_HOST }}
          username: ${{ secrets.EC2_USER }}
          key: ${{ secrets.EC2_SSH_KEY }}
          source: "${{ env.APP_NAME }}.tar.gz"
          target: "/tmp/"
          debug: true

      - name: Configure & restart service on EC2
        uses: appleboy/ssh-action@v1.0.3
        with:
          host: ${{ secrets.EC2_HOST }}
          username: ${{ secrets.EC2_USER }}
          key: ${{ secrets.EC2_SSH_KEY }}
          script: |
            set -euo pipefail

            APP_NAME="${{ env.APP_NAME }}"
            REMOTE_DIR="${{ env.REMOTE_DIR }}"
            SERVICE_NAME="${{ env.SERVICE_NAME }}"
            ENV_FILE="${{ env.ENV_FILE }}"
            TARBALL="/tmp/${APP_NAME}.tar.gz"

            # Create app directory and user if needed
            if ! id -u "$APP_NAME" >/dev/null 2>&1; then
              sudo useradd -r -s /bin/false -M "$APP_NAME"
            fi
            sudo mkdir -p "$REMOTE_DIR"
            sudo chown -R "$APP_NAME":"$APP_NAME" "$REMOTE_DIR"

            # Write/refresh environment file (runtime config)
            sudo bash -c "cat > $ENV_FILE" << 'EOF'
            MIX_ENV=prod
            PHX_SERVER=true
            PORT=${{ env.PHX_PORT }}
            PHX_HOST=${{ env.PHX_HOST }}
            SECRET_KEY_BASE=${SECRET_KEY_BASE}
            RELEASE_COOKIE=${RELEASE_COOKIE}
            ${{ secrets.ENV_VARS }}
            EOF

            # Substitute secrets into env file
            sudo sed -i "s|\${SECRET_KEY_BASE}|${{ secrets.SECRET_KEY_BASE }}|g" "$ENV_FILE"
            sudo sed -i "s|\${RELEASE_COOKIE}|${{ secrets.RELEASE_COOKIE }}|g" "$ENV_FILE"
            sudo sed -i "s|\${PHX_HOST}|${{ env.PHX_HOST || 'localhost' }}|g" "$ENV_FILE"
            sudo sed -i "s|\${PORT}|${{ env.PHX_PORT || '80' }}|g" "$ENV_FILE"

            # Lock down environment file
            sudo chown root:root "$ENV_FILE"
            sudo chmod 600 "$ENV_FILE"

            # Install or refresh systemd service
            SERVICE_FILE="/etc/systemd/system/${SERVICE_NAME}.service"
            sudo bash -c "cat > $SERVICE_FILE" << EOF
            [Unit]
            Description=${APP_NAME} Phoenix app
            After=network-online.target
            Wants=network-online.target

            [Service]
            Type=simple
            User=${APP_NAME}
            Group=${APP_NAME}
            EnvironmentFile=${ENV_FILE}
            WorkingDirectory=${REMOTE_DIR}
            ExecStart=${REMOTE_DIR}/bin/${APP_NAME} start
            ExecStop=${REMOTE_DIR}/bin/${APP_NAME} stop
            Restart=always
            RestartSec=5
            PrivateTmp=true
            LimitNOFILE=65536

            [Install]
            WantedBy=multi-user.target
            EOF

            sudo systemctl daemon-reload
            sudo systemctl enable "${SERVICE_NAME}"

            # Deploy new release
            TMP_DIR="$(mktemp -d)"
            sudo tar -C "$TMP_DIR" -xzf "$TARBALL"
            if [ -d "${REMOTE_DIR}/var" ]; then
              sudo rsync -a "${REMOTE_DIR}/var/" "${TMP_DIR}/var/" || true
            fi
            sudo rsync -a --delete "${TMP_DIR}/" "${REMOTE_DIR}/"
            sudo chown -R ${APP_NAME}:${APP_NAME} "${REMOTE_DIR}"

            # Clean old logs for a fresh journal view
            sudo systemctl stop "${SERVICE_NAME}"
            sudo journalctl --rotate
            sudo journalctl --vacuum-time=1s

            # Restart service with clean logs
            sudo systemctl start "${SERVICE_NAME}"
            sudo systemctl status "${SERVICE_NAME}" --no-pager -l || true

            echo "Deployment complete."

            # Show the most recent service logs
            sudo journalctl -u "${SERVICE_NAME}" -n 20 --since "10 seconds ago"
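
For completeness, here is roughly how the values in that environment file are read by the release at boot via config/runtime.exs. This mirrors the file generated by recent Phoenix versions (swap :myapp and MyAppWeb.Endpoint for your own names):

import Config

# systemd starts the release with PHX_SERVER=true in the EnvironmentFile,
# which tells the release to boot the HTTP server.
if System.get_env("PHX_SERVER") do
  config :myapp, MyAppWeb.Endpoint, server: true
end

if config_env() == :prod do
  secret_key_base =
    System.get_env("SECRET_KEY_BASE") ||
      raise "environment variable SECRET_KEY_BASE is missing"

  host = System.get_env("PHX_HOST") || "example.com"
  port = String.to_integer(System.get_env("PORT") || "4000")

  config :myapp, MyAppWeb.Endpoint,
    url: [host: host, port: 443, scheme: "https"],
    http: [ip: {0, 0, 0, 0}, port: port],
    secret_key_base: secret_key_base
end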