Reproducible Demo of Traefik 2.0 Traffic Mirroring on EKS


What to expect in this doc:

  • Traefik 2.0 has traffic mirroring functionality that should work on generic Kubernetes, but there are no good how-to guides for it. Let this be the first.
  • This is a how-to guide that's optimized for understanding.
  • I'll cover setup, useful background context, and troubleshooting info.
  • Test-driven development is best, so there will be a test that proves the setup works as intended.

Shout-out to DoiT International

  • One nice thing about working at DoiT is that we're encouraged, and given some time, to learn and occasionally lab things out so we can go above and beyond when supporting our customers.
  • This guide is the result of that practice and was made to help support a DoiT customer.

Prep Work not covered

  • You need to provision a Kubernetes cluster. I'm using a generic EKS cluster with default settings (e.g., it uses the legacy LB controller rather than the aws-load-balancer-controller add-on).
  • You need to own (~$12) or have admin access to a DNS name (for easy HTTPS certificate generation) (I'll be using the DNS name
  • The instructions and configuration assume DNS name If you change that to your own, you'd need to update the config files to your DNS name.
  • Assumptions I'm making:
    • You have access to Unix bash/zsh terminal
    • Common CLI tools like docker, kubectl, and helm are pre-installed
    • Your ~/.kube/config context is pointing to the right cluster
    • Bash# code file.yaml
      ^-- You understand this convention to mean "edit file.yaml". code is the command-line launcher for the editor I'm using, which I made runnable from the CLI by following a random guide on the internet that involved adding
      export PATH="$PATH:/Applications/Visual Studio" to my Mac's ~/.zshrc file.
      You don't have to do this; you could replace code in code file.yaml with vi file.yaml, nano file.yaml, etc.
    • I assume you're doing this lab in an isolated sandbox AWS account
      Don't try this lab in a staging/prod AWS account. Ideally, do it in an isolated sandbox AWS account rather than a shared dev account. That said, it's safe if you follow the directions. As a general best practice, follow any random how-to guide from the internet in an isolated sandbox AWS account, for defense-in-depth reasons. (This limits the blast radius if something bad happens, whether because you made a mistake while following the guide or because the guide was poisoned to trick you into running an exploit payload.)
      See the security notice in steps 4 and 5 for specifics on how to maximize safety.
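
The tool assumptions above can be checked before starting. This is a minimal sketch and not part of the original lab; the helper name missing_tools is hypothetical, and the tool list simply mirrors the CLI tools named in the assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): print every command
# from the argument list that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report on the CLI tools this lab assumes are pre-installed.
missing=$(missing_tools docker kubectl helm)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found"
fi
```

To double check the kubeconfig assumption as well, kubectl config current-context prints the context that kubectl will talk to.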

Step 1: Install traefik helm chart

  1. Review traefik.helm-values.yaml. It's 48 lines and contains useful background context as comments. It also has an annotation, specific to AWS, that gets applied to the Kubernetes Service of type LoadBalancer. If you want to use these steps outside of AWS, you may need to edit the annotations to provision your CSP's L4 LB (Cloud Service Provider Layer 4 (TCP) Load Balancer).

  2. Install the helm chart. AWS users should be able to copy and paste as is.

helm repo add traefik
helm repo update
mkdir -p ~/traefik-lab
cd ~/traefik-lab
curl > traefik.helm-values.yaml
head traefik.helm-values.yaml

ls -la

helm upgrade --install traefik traefik/traefik --version 21.1.0 --values=traefik.helm-values.yaml --namespace=traefik --create-namespace=true

kubectl get pods -n=traefik
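
Instead of re-running kubectl get pods until the pod shows Running, the wait can be scripted. This is a sketch under assumptions: the wrapper name wait_for_traefik is hypothetical, and it assumes the chart created a Deployment named traefik (which is what I'd expect for a release named traefik with the chart's default settings).

```shell
# Hypothetical wrapper: block until the Traefik Deployment reports ready.
# Assumes release name "traefik" in namespace "traefik", as installed above,
# so the chart's Deployment is expected to be called "traefik" too.
wait_for_traefik() {
  timeout=${1:-120s}
  kubectl rollout status deployment/traefik --namespace=traefik --timeout="$timeout"
}
```

Usage: wait_for_traefik, or wait_for_traefik 300s for a slower cluster. A non-zero exit code means the rollout did not finish within the timeout.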

Step 2: Provision an HTTPS wildcard cert from Let's Encrypt

Background Context:

  • Let's Encrypt is a non-profit Free Internet Certificate Authority.
  • We'll use an interactive shell to provision the HTTPS cert using a generic methodology
  • The wildcard cert we generate will work for and https://* (but not subdomains)
  • You'll need to replace with a domain name you own / control.

Step 2 Instructions:

  1. Start the cert provisioning process
# [admin@workstation:~/traefik-lab]
mkdir -p ~/traefik-lab/cert
cd ~/traefik-lab/cert

docker run -it --entrypoint=/bin/sh --volume $HOME/traefik-lab/cert:/.lego/certificates
# [shell@dockerized-ACME-client:/]
# (Note: /lego is intentional, lego alone will say lego not found in path)
/lego --email "" --domains="*" --dns "manual" run
  2. The terminal will say something along the lines of
Do you accept the TOS? Y/n
2023/03/16 15:05:58 [INFO] [*] acme: Obtaining bundled SAN certificate
2023/03/16 15:05:59 [INFO] [*] AuthURL:
2023/03/16 15:05:59 [INFO] [*] acme: use dns-01 solver
2023/03/16 15:05:59 [INFO] [*] acme: Preparing to solve DNS-01
lego: Please create the following TXT record in your zone: 120 IN TXT "WKnGHot_TzrzkKIwMpLwrymEZr6m3ZyQEQsEcG5C4Bo"
lego: Press 'Enter' when you are done
  3. Manually update your authoritative DNS nameserver. (In my case I went to my domain registrar, clicked on the entry, and verified it was configured to use Google's name servers rather than being delegated elsewhere ("Your domain is using Google Domains name servers"); then I created a custom TXT record.)
    _acme-challenge TXT 300 WKnGHot_TzrzkKIwMpLwrymEZr6m3ZyQEQsEcG5C4Bo

  4. Once done, I pressed Enter in the terminal that was waiting for human input. Within a minute the DNS update had finished propagating. (If it's slow, you can speed things up by pointing your laptop's DNS at a public resolver such as Google DNS or Cloudflare DNS.)

  5. The terminal will say something along the lines of

2023/03/16 15:23:13 [INFO] [*] acme: Waiting for DNS record propagation.
2023/03/16 15:23:20 [INFO] [*] The server validated our request
2023/03/16 15:23:20 [INFO] [*] acme: Cleaning DNS-01 challenge
lego: You can now remove this TXT record from your zone: 120 IN TXT "..."
2023/03/16 15:23:20 [INFO] [*] acme: Validations succeeded; requesting certificates
2023/03/16 15:23:21 [INFO] [*] Server responded with a certificate.
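
Before pressing Enter in the lego prompt, it can help to confirm that the TXT record is already visible. This is a rough sketch, assuming dig is installed; the helper name txt_record_ready and the example.com domain are hypothetical placeholders.

```shell
# Hypothetical helper: succeed only if the _acme-challenge TXT record for a
# domain already contains the token that lego asked for.
txt_record_ready() {
  domain=$1
  token=$2
  # dig prints each TXT value wrapped in double quotes, one per line.
  # -e keeps grep from reading a token that starts with "-" as an option.
  dig +short TXT "_acme-challenge.$domain" | tr -d '"' | grep -qxF -e "$token"
}
```

Usage: txt_record_ready example.com "<token from the lego prompt>" && echo "safe to press Enter".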

Step 3: Generate and apply an HTTPS secret

Use the cert files to generate and apply a kube secret containing an HTTPS wildcard cert

# [shell@dockerized-ACME-client:/]

# [admin@workstation:~/traefik-lab/cert]

cd ~/traefik-lab

export B64_CERT=$(cat ~/traefik-lab/cert/ | base64)
export B64_KEY=$(cat ~/traefik-lab/cert/ | base64)
echo "$B64_CERT"
echo "$B64_KEY"
# ^-- Looking for a really long string of gibberish
#     Basically left shifted smoke test b4 moving on

tee https-wildcard-cert.yaml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: https-wildcard-cert # covers both AND https://*
  namespace: traefik
type: kubernetes.io/tls
data:
  tls.crt: $B64_CERT
  tls.key: $B64_KEY
EOF

cat https-wildcard-cert.yaml

kubectl apply -f https-wildcard-cert.yaml
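
One portability note on the base64 step: I ran it on a Mac, but GNU base64 on Linux wraps its output at 76 columns by default, which would split the value across lines inside the generated YAML. Here is a small sketch that sidesteps that; the helper name b64_one_line is hypothetical.

```shell
# Hypothetical helper: base64-encode a file as ONE line.
# GNU base64 (Linux) wraps output at 76 columns by default; stripping the
# newlines gives the same single-line result on Linux and macOS.
b64_one_line() {
  base64 < "$1" | tr -d '\n'
}
```

Usage: pass it the same certificate and key files used above, for example export B64_CERT=$(b64_one_line <path to the cert file>). After applying, kubectl get secret https-wildcard-cert -n=traefik should list the secret.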

Step 4: Make sure you read step 5's IMPORTANT: security concern carefully

Step 5: Apply config to expose traefik dashboard via HTTPS and authn

  1. Grab Example file
  2. modify as needed in terms of DNS names
  3. IMPORTANT: security concern
    Change the username and password using the method mentioned in the YAML comment. If you leave the defaults and someone logs in to your Traefik instance, a bad actor could in theory install a plugin and use it to escalate to kubectl cluster-admin. Your EKS cluster probably also has some IAM rights in the AWS account it resides in, which could be used to escalate further into that AWS account. (This is why isolated sandbox/dev AWS accounts are a known best practice.) I immediately shut down my cluster after going live with this for that reason.
  4. Make sure you edit the username/password in the file before applying. As long as you do that, it's safe security-wise; it's extremely dangerous to apply without first changing the username/password to a secure value.
  5. kubectl apply
# [admin@workstation:~/traefik-lab]
curl > traefik_dashboard_and_default_tls.yaml
head traefik_dashboard_and_default_tls.yaml

code traefik_dashboard_and_default_tls.yaml
# ^-- update username password and DNS name from to yours as needed
#     and read any notes about background contextual info useful to understanding
kubectl apply -f traefik_dashboard_and_default_tls.yaml
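
For picking the replacement credentials, the one-liner from the YAML comment (head -c 100 /dev/urandom | base64 | head -c 30) can be wrapped in a function. This is just a sketch; the name random_cred is hypothetical.

```shell
# Hypothetical helper built on the one-liner from the YAML comment:
# print a random 30-character string derived from /dev/urandom.
random_cred() {
  head -c 100 /dev/urandom | base64 | tr -d '\n' | head -c 30
  echo
}
```

Usage: run random_cred twice and paste the two results into the username and password fields of the Secret before running kubectl apply. The output uses the base64 alphabet, so it can contain / and + characters.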

Step 6: Update DNS then visit site

  1. kubectl get services -n=traefik
NAME      TYPE           CLUSTER-IP     EXTERNAL-IP                                                                     PORT(S)                      AGE
traefik   LoadBalancer   80:30116/TCP,443:30321/TCP   175m
  1. I see my L4 LB has a CNAME of
  2. In (the spot you update internet DNS is likely different) I added a custom record like this * CNAME 300
  3. Bash# nslookup

Non-authoritative answer:	canonical name =
  1. Since that looks good I visit the website to verify
  2. I see no HTTPS errors and I get prompted for a username and password
    I copy and paste the values embedded in the earlier config yaml file
    Username: et9B6fBZUYeDOmzvukiquYw5KrCqKy
    Password: QREqS/DVPF6dye3/FE30UCS8S5pwID
    (Chrome will cache them so the gibberish passwords won't be too annoying since I won't need to enter them every time.)
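
The same check can be done from the terminal. A sketch, assuming curl is installed; the helper names http_status and dashboard_requires_auth are hypothetical, and the URL you pass is whatever dashboard address you configured.

```shell
# Hypothetical helper: print only the HTTP status code returned for a URL.
# Extra arguments (such as --user) are passed straight through to curl.
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' "$@"
}

# Hypothetical check: without credentials the dashboard should answer 401,
# which suggests the basic-auth middleware is attached to the route.
dashboard_requires_auth() {
  [ "$(http_status "$1")" = "401" ]
}
```

Usage: dashboard_requires_auth "<your dashboard URL>" && echo "auth is enforced". Calling http_status --user "<username>:<password>" "<your dashboard URL>" with valid credentials should print a success code instead.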

You can expect to see something like this image


Step 7: Deploy blue and green version of stateful mock application

  • Memory backed redis makes the app stateful (this will be useful in validating traffic mirroring later)
  • redis is useful for testing as we can easily factory reset the state by rebooting the redis pod
  • otherwise it gives us statefulness with minimal dependencies / configuration.
# [admin@workstation:~/traefik-lab]
curl > blue.helm-values.yaml
curl > green.helm-values.yaml
head green.helm-values.yaml

# Inspect values and update dns name as needed
code blue.helm-values.yaml
code green.helm-values.yaml

helm upgrade --install podinfo oci:// --values=blue.helm-values.yaml --namespace blue --create-namespace=true
helm upgrade --install podinfo oci:// --values=green.helm-values.yaml --namespace green --create-namespace=true

The 2 websites and look like image

Step 8: Deploy Traefik Mirror

# [admin@workstation:~/traefik-lab]
curl > green-with-traffic-mirroring-to-blue.yaml

# Inspect file / edit DNS names as needed. The only object that should
# need to be updated is the IngressRoute custom resource object near the end.
code green-with-traffic-mirroring-to-blue.yaml

kubectl apply -f green-with-traffic-mirroring-to-blue.yaml
Now shows the green website (and behind the scenes is mirroring incoming traffic to the blue website)

Notes of interest:

  • One of the nice things about Traefik is that it's written in Go, so it rarely has CVEs. That means less risk, relative to the alternatives, when you follow the pattern of configuring it once and then not updating it for a really long time.
  • Traefik 2.0 is finicky on Kubernetes:
    • There's a lack of solid Kubernetes-specific docs.
    • Configuring Traefik through Kubernetes CRs (custom resources) has poor UX (user experience). Custom resource config isn't validated, which results in a poor feedback loop when the config is invalid. If your syntax is slightly off, or you're missing a value, a Traefik-specific custom resource's config might not get loaded into Traefik; you can tell by watching the Traefik dashboard between changes. There are also some edge cases where you need to reboot the Traefik pod before removed Kubernetes objects disappear from Traefik.
  • Traefik 3.0 is working to improve the Kubernetes UX.
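
Besides watching the dashboard, the Traefik pod logs are another place to look when a custom resource seems to be ignored. This is a sketch under assumptions: the helper name traefik_problems is hypothetical, and the pattern assumes Traefik's plain-text log format, where each line carries a level=<name> field (it won't match if JSON logging is enabled).

```shell
# Hypothetical helper: keep only error/warning lines from Traefik logs on stdin.
# Assumes the plain-text log format with a level=<name> field on each line.
traefik_problems() {
  # "|| true" keeps the exit code at 0 when nothing matches.
  grep -E 'level=(error|warn)' || true
}
```

Usage: kubectl logs --namespace=traefik deployment/traefik | traefik_problems, run right after a kubectl apply that didn't show up in the dashboard.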

Explanation of the YAML config file:

  • DNS exists at multiple levels.
    • Inner cluster DNS: DNS names resolvable only by pods / workloads running in the cluster.
    • LAN/VPC DNS: DNS names resolvable only by VMs on the LAN / in the VPC.
    • Internet DNS: DNS names resolvable by any machine on the internet.
    • Pods can resolve all 3.
  • This solution leverages Kubernetes services of type ExternalName
    • Kubernetes services generate inner cluster DNS names
      Usually of the form $SERVICE_NAME.$NAMESPACE_NAME.svc.cluster.local
    • ExternalName services are similar to DNS CNAMES / think of them as DNS based redirects
  • primary-route service in the mirror namespace:
    creates inner cluster dns name primary-route.mirror (short for FQDN: primary-route.mirror.svc.cluster.local)
    that redirects to (which is an inner cluster fully qualified domain name, but could easily be updated to a public internet dns name)
  • mirror-route service in the mirror namespace:
    creates inner cluster dns name mirror-route.mirror (short for FQDN: mirror-route.mirror.svc.cluster.local)
    that redirects to (which is an inner cluster fully qualified domain name, but could easily be updated to a public internet dns name)
  • Services of type ExternalName are used so that this how-to guide acts as an exemplar (ideal example) showing the most flexible implementation option.
    • This config allows us to mirror traffic to Kubernetes Services in other namespaces.
    • This config also allows us to mirror traffic to generic DNS names on the internet, such as one hosted on an external EKS cluster (you could mirror traffic to the DNS name of a staging cluster, for example).
    • The TraefikService (a Kubernetes custom resource that lets you configure traffic mirroring in Traefik)
      seems to have a limitation where it only allows mirroring traffic to Services in the namespace where the TraefikService itself exists. (Using a DNS redirect, via a Service of type ExternalName, works around this limitation.)
  • Pointing out an interesting oddity
    • The TraefikService named primary-route-with-mirror points to port 9898
    kubectl get service -n=mirror
    # NAME            TYPE           CLUSTER-IP   EXTERNAL-IP                       PORT(S)   AGE
    # mirror-route    ExternalName   <none>    <none>    9h
    # primary-route   ExternalName   <none>   <none>    9h
    # ^-- I gave them port none, b/c the yaml value wouldn't be respected anyways, it's the service they point to
    #     that decides the port listened on
    # The following can be used to verify the ExternalName kubernetes services do in fact work
    kubectl run -it curl -- sh &
    kubectl exec -it curl -- curl -t0 primary-route.mirror:9898
    kubectl exec -it curl -- curl -t0 mirror-route.mirror:9898
    # ^-- you'll get feedback that they work
    kubectl delete pod curl
  • What is a traefik traffic mirror? / the nature of a traffic mirror? How to interpret the mirrors part of the yaml?
    apiVersion: traefik.containo.us/v1alpha1
    kind: TraefikService
    metadata:
      name: primary-route-with-mirror
      namespace: mirror
    spec:
      mirroring:
        kind: Service
        name: primary-route
        port: 9898
        mirrors:
        - kind: Service
          name: mirror-route
          port: 9898
          percent: 100
    • ^-- Note the original YAML has more comments; in this section I want to elaborate on how to interpret the mirrors list.
    • There can only be 1 primary service. The primary service has bidirectional communication with the client, so when you run curl it's ONLY the primary service (green) that responds back to the client.
    • You can have multiple mirrors (- denotes a YAML list item). The mirrors get unidirectional communication with the calling client: when the client runs curl, the mirror (blue) receives the traffic generated by the client, but blue isn't allowed to respond back to the client. (This should make sense, as blue represents WIP code that we want to test, so you wouldn't want potentially buggy code being able to respond back to client users.)
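
The inner cluster naming convention described above is easy to get slightly wrong when typing an externalName by hand, so here is a tiny sketch that builds the name. The helper name svc_fqdn is hypothetical, and it assumes the default cluster domain cluster.local.

```shell
# Hypothetical helper: build the in-cluster FQDN for a Service, following the
# $SERVICE_NAME.$NAMESPACE_NAME.svc.cluster.local convention.
# Assumes the default cluster domain "cluster.local".
svc_fqdn() {
  printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}
```

Usage: svc_fqdn primary-route mirror prints primary-route.mirror.svc.cluster.local. To see what an ExternalName service actually points to, kubectl get service primary-route --namespace=mirror --output=jsonpath='{.spec.externalName}' prints the configured target.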

Step 9: Test the Mirror / Verify it's working

  1. Set Variable to correct test value
    echo $DOMAIN
  2. Add Data to stateful endpoints
    curl -X POST -d "green persistence test value" https://green.$DOMAIN/cache/test-key
    curl -X POST -d "blue persistence test value" https://blue.$DOMAIN/cache/test-key
  3. Fetch Data from stateful endpoints
    curl https://green.$DOMAIN/cache/test-key
    (Returns: green persistence test value)
    curl https://blue.$DOMAIN/cache/test-key
    (Returns: blue persistence test value)
  4. Use Mirror to push data to both stateful endpoints at the same time
    curl -X POST -d "mirrored data push" https://green-with-traffic-mirroring-to-blue.$DOMAIN/cache/mirror-test
  5. Fetch Data to validate if mirror worked
    curl https://green.$DOMAIN/cache/mirror-test
    (Returns: mirrored data push)
    curl https://blue.$DOMAIN/cache/mirror-test
    (Returns: mirrored data push)
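
The manual steps above can be rolled into one repeatable check. This is a minimal sketch, not the definitive test: the function name verify_mirror is hypothetical, it assumes the same green, blue, and green-with-traffic-mirroring-to-blue host names under your domain, and the short retry loop is my own addition in case the mirrored request lands slightly after the primary response.

```shell
# Hypothetical end-to-end check for the steps above. Assumes the host names
# green, blue and green-with-traffic-mirroring-to-blue under the given domain.
verify_mirror() {
  domain=$1
  key="mirror-test-$$"
  value="mirrored data push $key"

  # Write once, through the mirrored route only.
  curl -fsS -X POST -d "$value" \
    "https://green-with-traffic-mirroring-to-blue.$domain/cache/$key" >/dev/null || return 1

  # Both backends should now hold the value; retry a few times per backend.
  for color in green blue; do
    attempt=0
    until [ "$(curl -fsS "https://$color.$domain/cache/$key" 2>/dev/null)" = "$value" ]; do
      attempt=$((attempt + 1))
      if [ "$attempt" -ge 5 ]; then
        echo "FAIL: $color did not return the mirrored value" >&2
        return 1
      fi
      sleep 1
    done
  done
  echo "PASS"
}
```

Usage: verify_mirror "$DOMAIN". If blue fails while green passes, the primary route works but the mirror isn't receiving traffic, which is the case worth debugging in the Traefik dashboard.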

Here's what the final traefik dashboard of the working setup looks like if you're curious




# helm values for blue's install
# helm upgrade --install podinfo oci:// --values=blue.helm-values.yaml --namespace blue --create-namespace=true
replicaCount: 1
logLevel: info
ui:
  color: "#0000FF" # blue
  message: "Greetings from blue app!"
redis:
  enabled: true
ingress:
  enabled: true
  className: "traefik"
  hosts:
    - host:
      paths:
        - path: /
          pathType: ImplementationSpecific

# mentions TraefikService CR is for mirroring
---
apiVersion: v1
kind: Namespace
metadata:
  name: mirror
---
# In this example
# green will represent production (intended live)
# blue will represent staging (work in progress / future version)
# For traefik mirror:
# traffic will go to primary route of green AND green will respond to client
# traffic will be mirrored to blue AND blue will not respond to client as this is the nature of a mirror / helps test safely
apiVersion: v1
kind: Service
metadata:
  name: primary-route
  namespace: mirror
spec:
  type: ExternalName #v-- ExternalName must use FQDN
  externalName: #Comes from Kubernetes inner cluster DNS convention of $SERVICE_NAME.$NAMESPACE_NAME.svc.cluster.local
# ports are purposefully missing as externalName doesn't respect their config, it's the service externalName points to that decides port
---
apiVersion: v1
kind: Service
metadata:
  name: mirror-route
  namespace: mirror
spec:
  type: ExternalName #v-- ExternalName must use FQDN
  externalName: #Comes from Kubernetes inner cluster DNS convention of $SERVICE_NAME.$NAMESPACE_NAME.svc.cluster.local
# ports are purposefully missing as externalName doesn't respect their config, it's the service externalName points to that decides port
---
apiVersion: traefik.containo.us/v1alpha1
kind: TraefikService
metadata:
  name: primary-route-with-mirror
  namespace: mirror
spec:
  mirroring:
    kind: Service # referring to a Kubernetes Service
    name: primary-route #<-- "must" reference a pre-existing object (of kind: Service) in the same namespace
    port: 9898 # Required value
    mirrors: # The mirror "might" be able to exist in an external namespace, but using ExternalName service in same namespace gives best flexibility
      - kind: Service
        name: mirror-route
        port: 9898 # Required value
        percent: 100
# Note: traefik 2.0 CRDs are very finicky
# If your syntax is slightly off, or you're missing a value like a port
# It won't show up in traefik dashboard GUI / the config will be silently ignored
# It's critical you have access to that to debug this stuff.
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: primary-route-with-mirror
  namespace: mirror
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(``)
      services:
        - kind: TraefikService
          name: primary-route-with-mirror
          namespace: mirror

# helm values for green's install
# helm upgrade --install podinfo oci:// --values=green.helm-values.yaml --namespace green --create-namespace=true
replicaCount: 1
logLevel: info
ui:
  color: "#00FF00" # green
  message: "Greetings from green app!"
redis:
  enabled: true
ingress:
  enabled: true
  className: "traefik"
  hosts:
    - host:
      paths:
        - path: /
          pathType: ImplementationSpecific

# helm values for traefik helm chart install
# helm upgrade --install traefik traefik/traefik --version 21.1.0 --values=traefik.helm-values.yaml --namespace=traefik --create-namespace=true
ingressClass: # Create a IngressClass for Traefik
  enabled: true
  isDefaultClass: true # and make it the default IngressClass
  fallbackApiVersion: v1
ingressRoute:
  dashboard:
    enabled: false
    # ^-- This is just disabling helm creating a traefik dashboard.
    # In a future step, I'll create it manually outside of helm's
    # logic to gain more control over the configuration.
    # Mine will be exposed to the internet, but protected by
    # authentication via a username and password.
service:
  annotations: # EKS / Kube on AWS config for L4 LB -----v "nlb"
ports:
  web:
    redirectTo: websecure #Helps make an error message more obvious
    # Context:
    # curl
    # shorthand for curl
    # now gives "Moved Permanently" when before it gave 404 not found
    # which was a little confusing as the web browser worked
    # curl also worked
providers:
  kubernetesCRD: #
    allowCrossNamespace: true # IngressRoute resources able to reference other namespaces
    allowExternalNameServices: true # another variation of cross namespace reference
    allowEmptyServices: true # helps iteratively debug non-working configuration
  kubernetesIngress:
    enabled: true
    allowExternalNameServices: true # makes cross namespace routing easier
    allowEmptyServices: true # helps iteratively debug non-working configuration
    # Note: If you do iterative debugging with allow empty on, traefik's
    # dashboard won't clean up deleted CRDs until the traefik pods
    # get rebooted, so if the GUI of the traefik dashboard looks
    # off consider deleting / rebooting the traefik pods.
rbac:
  enabled: true
  namespaced: false # works clusterwide

apiVersion: v1
kind: Secret
metadata:
  name: basic-auth-creds
  namespace: traefik
type: kubernetes.io/basic-auth
stringData: # v-- generated using 'head -c 100 /dev/urandom | base64 | head -c 30'
  username: et9B6fBZUYeDOmzvukiquYw5KrCqKy
  password: QREqS/DVPF6dye3/FE30UCS8S5pwID
  # ^-- You'd want this to be secret / not posted on the public internet
  # I'm sharing a randomly generated secret of a throwaway sandbox environment
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: basic-auth
  namespace: traefik
spec:
  basicAuth:
    secret: basic-auth-creds
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: dashboard
  namespace: traefik
spec:
  # spec.entryPoints full list is available here -->
  # web = port 80
  # websecure = port 443
  entryPoints:
    - websecure
  routes:
    - kind: Rule #Rule says dashboard reachable from using basic-auth
      match: Host(``)
      services:
        - name: api@internal #Referring to a built-in implicit default value that exists in
          kind: TraefikService #traefik's dashboard, not a Kube Custom Resource of type TraefikService
      middlewares:
        - name: basic-auth
          namespace: traefik
---
# v-- This Traefik Custom Resource tells traefik about our wildcard cert and says use as default
apiVersion: traefik.containo.us/v1alpha1
kind: TLSStore
metadata:
  name: default
  namespace: traefik
spec:
  certificates:
    - secretName: https-wildcard-cert # kube tls secrets existing in the same namespace
  defaultCertificate:
    secretName: https-wildcard-cert
