
Secure Kubernetes Microservices Communication with Istio and OPA

by Zach Yonash | Wednesday, May 18, 2022 | Security, Microservices, SPIFFE, OPA, Istio

The cybersecurity landscape has been rapidly evolving in recent years. Many companies have moved well past cloud adoption and are now fully utilizing a hybrid of cloud-native and on-premises technologies, prompting the need for a variety of new security measures to ensure critical workloads aren't compromised. One of the core tenets of zero trust is workload identity: under the zero trust mindset, identity verification between your microservices needs to be mutual (see: mutual TLS). How can this be achieved when managing heterogeneous services deployed to wildly different systems and built using different programming languages?

If you've been involved in solving aspects of this problem in the past, you've probably turned to a service mesh. A number of service meshes exist to tackle a wide range of issues with heterogeneous service-to-service communication, but arguably the most popular of the bunch is Istio. Back in 2019, BoxBoat put out a blog post describing the fundamentals of Istio, and these concepts are still very relevant today, though Istio has added a number of useful features since then.

Service meshes can be fairly complicated in different ways, and Istio is no exception. However, it does provide a number of useful abstractions that allow engineers to more easily solve various challenges that they'll face when tackling service communication. In keeping with the theme of workload identity and mutually secure communication, let's take a look at how you can leverage one of these abstractions: authorization policies. The goal here will be to craft an authorization policy that offloads external authorization to OPA (Open Policy Agent), which we can use in conjunction with the SPIFFE-compliant identities that our workloads receive as part of Istio's built-in certificate authority to make authorization decisions.

Quick notes on:

SPIFFE

At the heart of a modern, secure microservice platform should be a method of performing robust identity checks between services. SPIFFE (Secure Production Identity Framework for Everyone) helps solve this by defining specifications that can be leveraged to establish mutual identity. On its own, SPIFFE provides key specifications and concepts for workload attestation, but it needs to be implemented by a tool before services can put it to real-world use.
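
The central concept is the SPIFFE ID, a URI that uniquely identifies a workload within a trust domain. As an illustrative example, a SPIFFE ID takes a form like:

spiffe://example.org/billing/payments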

There are a handful of popular tools that implement SPIFFE. You usually don't see SPIFFE without some mention of SPIRE (the SPIFFE Runtime Environment), mainly because it's the reference implementation created by the SPIFFE project itself. For the purposes of this post, we're going to use Istio's implementation of SPIFFE via X.509 certificates generated by the istiod CA.

OPA

OPA (Open Policy Agent) is a policy engine that abstracts decision-making from your underlying software. In this use case, we're using OPA as an external authorization service to make decisions on when to allow traffic to make it to our application through the Istio proxy sidecar.
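
To make that concrete, here's a tiny, hypothetical Rego policy of the same shape we'll use later - deny by default, and allow only when a condition on the input holds:

package example.authz

# Deny everything unless a rule below says otherwise.
default allow = false

# Allow when the caller's input document declares a GET request.
allow {
    input.method == "GET"
}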

Prerequisites

In order to jump into a working example, let's get a few prerequisites out of the way.

A local/sandbox k8s environment

I personally use Rancher Desktop on macOS which, under the hood, runs k3s on a virtual machine. Feel free to use what you're most comfortable with, but make sure you've met the platform requirements to install Istio. In my case, I simply bumped Rancher Desktop's memory to 8GB and CPU count to 4. Keep in mind that the latest version of Istio (1.13 at the time of writing) only runs on Kubernetes versions 1.20 through 1.23.
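
You can quickly confirm which version your sandbox cluster is running with:

kubectl version --short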

Install Istio

In my case, I preferred to install Istio using Helm. Follow these instructions to install both the base and discovery (istiod) charts.
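As a sketch, those steps boil down to something like the following (chart repo URL per the official Istio Helm install guide; pin chart versions as needed):

helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
kubectl create namespace istio-system
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait

Verify that your helm status output looks like this: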

helm status istiod -n istio-system
NAME: istiod
LAST DEPLOYED: Tue Apr 19 10:27:49 2022
NAMESPACE: istio-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
"istiod" successfully installed!

Let's get into it!

We're going to:

  • Create a frontend and backend namespace to hold our apps
  • Configure an AuthorizationPolicy CRD to use a CUSTOM action to offload authz to OPA
  • Configure a gRPC external authz provider inside the Istio mesh config
  • Deploy an OPA policy used by the destination workload to check against the source workload's SVID
  • Deploy two basic applications (httpbin), the destination including an OPA sidecar
  • Test out a few simple HTTP requests to verify the policy is working correctly

Namespace prerequisites

Let's create two namespaces and label them for Istio sidecar injection:

kubectl create namespace frontend
kubectl create namespace backend
kubectl label namespace frontend istio-injection=enabled
kubectl label namespace backend istio-injection=enabled
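
You can confirm both labels took effect with kubectl's -L flag, which adds a column for the given label key:

kubectl get namespace frontend backend -L istio-injection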

Authorization Policy

cat > auth_policy.yaml <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: ext-authz
  namespace: backend
spec:
  selector:
    matchLabels:
      app: backend
  action: CUSTOM
  provider:
    name: "my-custom-authz"
  rules:
  - to:
    - operation:
        methods: ["GET", "POST"]
EOF
kubectl apply -f auth_policy.yaml

The above policy routes authorization requests to a provider (in our case, OPA) that we'll define shortly. The provider is only invoked when the incoming request uses the GET or POST method, and, per the selector, the policy applies only to the backend app.
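
As a quick sanity check, you can confirm the policy was created:

kubectl get authorizationpolicy ext-authz -n backend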

gRPC External Authorization Provider

Let's edit the default Istio mesh config in place…

kubectl edit configmap istio -n istio-system

… and add the following:

    extensionProviders:
    - name: "my-custom-authz"
      envoyExtAuthzGrpc:
        service: "local-opa-grpc.local"
        port: "9191"

Your data object will look something like this (assuming you're doing this against a fresh install of Istio):

data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      tracing:
        zipkin:
          address: zipkin.istio-system:9411
    enablePrometheusMerge: true
    extensionProviders:
    - name: "my-custom-authz"
      envoyExtAuthzGrpc:
        service: "local-opa-grpc.local"
        port: "9191"
    rootNamespace: null
    trustDomain: cluster.local
  meshNetworks: 'networks: {}'

With these last few changes, we've configured Istio to use the envoyExtAuthzGrpc extension provider, allowing us to direct requests over to OPA first for authorization (the default gRPC port for Envoy's OPA plugin is 9191).

OPA policy

We'll use a fairly simple OPA policy that inspects the incoming request and determines whether the XFCC (x-forwarded-client-cert) header matches the expected value.

Istio generates an SVID (SPIFFE Verifiable Identity Document) which encodes a SPIFFE ID of the following format:

spiffe://<domain>/ns/<namespace>/sa/<serviceaccount>

  • domain: The trust root of the system. Since we installed Istio with no additional customization, this will be cluster.local. You can verify this by looking at the value of data.mesh.trustDomain in the Istio mesh config we edited above.
  • namespace: The namespace the application resides in.
  • serviceaccount: The name of the service account attached to the application.
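
Putting those pieces together, the identity our frontend workload will present (and the value our policy will check for) is:

spiffe://cluster.local/ns/frontend/sa/frontend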

The following OPA policy will inspect the x-forwarded-client-cert header and ensure it matches the expected value.

cat > policy.rego <<EOF
package envoy.authz

import input.attributes.request.http as http_request

default allow = false

allow {
    svc_spiffe_id == "spiffe://cluster.local/ns/frontend/sa/frontend"
}

# The XFCC header carries four semicolon-delimited segments
# (By=...;Hash=...;Subject=...;URI=...); grab the URI segment
# and strip the "URI=" prefix to extract the caller's SPIFFE ID.
svc_spiffe_id = spiffe_id {
    [_, _, _, uri_type_san] := split(http_request.headers["x-forwarded-client-cert"], ";")
    [_, spiffe_id] := split(uri_type_san, "=")
}
EOF

kubectl create secret generic opa-policy -n backend --from-file policy.rego
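
If you have the opa CLI available, you can also sanity-check the policy locally. The input.json below is a hand-rolled stand-in for the request attributes Envoy would forward (the Hash value is a dummy):

cat > input.json <<EOF
{
  "attributes": {
    "request": {
      "http": {
        "headers": {
          "x-forwarded-client-cert": "By=spiffe://cluster.local/ns/backend/sa/default;Hash=dummy;Subject=\"\";URI=spiffe://cluster.local/ns/frontend/sa/frontend"
        }
      }
    }
  }
}
EOF
opa eval --data policy.rego --input input.json "data.envoy.authz.allow"

The query result should evaluate to true.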

Deploy httpbin applications

The following configuration will deploy:

  • A service to expose the destination app
  • An Istio ServiceEntry to expose the local gRPC server
  • A service account that will be attached to the frontend app
  • A deployment for the source app, and a deployment for the destination app that includes an OPA sidecar

kubectl apply -f - <<EOF  
---
apiVersion: v1  
kind: Service  
metadata:  
  name: backend
  labels:  
    app: backend
    service: backend
  namespace: backend
spec:  
  ports:  
  - name: http  
    port: 8000  
    targetPort: 80  
  selector:  
    app: backend
---  
apiVersion: networking.istio.io/v1alpha3  
kind: ServiceEntry  
metadata:  
  name: local-opa-grpc  
spec:  
  hosts:  
  - "local-opa-grpc.local"  
  endpoints:  
  - address: "127.0.0.1"  
  ports:  
  - name: grpc  
    number: 9191  
    protocol: GRPC  
  resolution: STATIC  
---  
kind: ServiceAccount  
apiVersion: v1
metadata:  
  name: frontend
  namespace: frontend
EOF
kubectl apply -f - <<EOF  
---  
kind: Deployment  
apiVersion: apps/v1  
metadata:  
  name: frontend  
  labels:  
    app: frontend  
  namespace: frontend
spec:  
  replicas: 1  
  selector:  
    matchLabels:  
      app: frontend  
  template:  
    metadata:  
      labels:  
        app: frontend  
    spec:  
      containers:  
        - image: docker.io/kennethreitz/httpbin  
          imagePullPolicy: IfNotPresent  
          name: frontend  
          ports:  
          - containerPort: 80  
      serviceAccountName: frontend  
---  
kind: Deployment  
apiVersion: apps/v1  
metadata:  
  name: backend  
  labels:  
    app: backend  
  namespace: backend
spec:  
  replicas: 1  
  selector:  
    matchLabels:  
      app: backend  
  template:  
    metadata:  
      labels:  
        app: backend  
    spec:  
      containers:  
        - image: docker.io/kennethreitz/httpbin  
          imagePullPolicy: IfNotPresent  
          name: backend  
          ports:  
          - containerPort: 80  
        - name: opa  
          image: openpolicyagent/opa:latest-envoy  
          imagePullPolicy: Always  
          securityContext:  
            runAsUser: 1111  
          volumeMounts:  
          - readOnly: true  
            mountPath: /policy  
            name: opa-policy  
          args:  
          - "run"  
          - "--server"  
          - "--addr=localhost:8181"  
          - "--set=plugins.envoy_ext_authz_grpc.addr=:9191"  
          - "--set=plugins.envoy_ext_authz_grpc.query=data.envoy.authz.allow"  
          - "--set=decision_logs.console=true"  
          - "--ignore=.*"  
          - "/policy/policy.rego"  
      volumes:  
        - name: opa-policy  
          secret:  
            secretName: opa-policy  
EOF
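
Before sending any traffic, it's worth confirming that sidecar injection worked: the frontend pod should report 2/2 containers ready (httpbin plus the Istio proxy), and the backend pod 3/3 (httpbin, the Istio proxy, and OPA):

kubectl get pods -n frontend
kubectl get pods -n backend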

Putting it all together

With everything deployed and configured, let's send a request from the source application to the destination application and inspect the decision logs to see whether OPA allowed or denied the request.

Let's exec into the frontend pod and fire off a GET request using curl, confirming the response comes back with a 200:

kubectl exec <frontend_pod_name> -it -n frontend -- /bin/bash
root@frontend-7b864c8565-l75x9:/# curl http://backend.backend:8000/get -w "%{http_code}\n" -s -o /dev/null
200
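
If you'd rather not eyeball pod names, you can look them up with a label selector:

kubectl get pod -n frontend -l app=frontend -o jsonpath='{.items[0].metadata.name}'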

To see what's happening under the hood, we'll inspect the OPA decision logs:

kubectl logs backend-5f576cf79f-2qpz9 -n backend opa
{"decision_id":"fa124d9b-56e5-4aa7-b779-206489a8b0d1","input":{"attributes":{"destination":{"address":{"socketAddress":{"address":"10.42.0.83","portValue":80}},"principal":"spiffe://cluster.local/ns/backend/sa/default"},"metadataContext":{},"request":{"http":{"headers":{":authority":"backend.backend:8000",":method":"GET",":path":"/get",":scheme":"http","accept":"*/*","user-agent":"curl/7.58.0","x-b3-sampled":"0","x-b3-spanid":"3b2c8fa7c963ce69","x-b3-traceid":"2334ef9a41e4a39e3b2c8fa7c963ce69","x-envoy-attempt-count":"1","x-envoy-peer-metadata":"ChwKDkFQUF9DT05UQUlORVJTEgoaCGZyb250ZW5kChoKCkNMVVNURVJfSUQSDBoKS3ViZXJuZXRlcwoZCg1JU1RJT19WRVJTSU9OEggaBjEuMTMuMwrKAQoGTEFCRUxTEr8BKrwBChEKA2FwcBIKGghmcm9udGVuZAohChFwb2QtdGVtcGxhdGUtaGFzaBIMGgo3Yjg2NGM4NTY1CiQKGXNlY3VyaXR5LmlzdGlvLmlvL3Rsc01vZGUSBxoFaXN0aW8KLQofc2VydmljZS5pc3Rpby5pby9jYW5vbmljYWwtbmFtZRIKGghmcm9udGVuZAovCiNzZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1yZXZpc2lvbhIIGgZsYXRlc3QKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsCiMKBE5BTUUSGxoZZnJvbnRlbmQtN2I4NjRjODU2NS1sNzV4OQoXCglOQU1FU1BBQ0USChoIZnJvbnRlbmQKTQoFT1dORVISRBpCa3ViZXJuZXRlczovL2FwaXMvYXBwcy92MS9uYW1lc3BhY2VzL2Zyb250ZW5kL2RlcGxveW1lbnRzL2Zyb250ZW5kChcKEVBMQVRGT1JNX01FVEFEQVRBEgIqAAobCg1XT1JLTE9BRF9OQU1FEgoaCGZyb250ZW5k","x-envoy-peer-metadata-id":"sidecar~10.42.0.79~frontend-7b864c8565-l75x9.frontend~frontend.svc.cluster.local","x-forwarded-client-cert":"By=spiffe://cluster.local/ns/backend/sa/default;Hash=d575dadf5f4561d2a52cabfb71cf6ed8099d4b282d1c12a9cc45b9dd568782c5;Subject=\"\";URI=spiffe://cluster.local/ns/frontend/sa/frontend","x-forwarded-proto":"http","x-request-id":"a872a5e4-672f-49f0-ae19-f978b77a3ebe"},"host":"backend.backend:8000","id":"18074164579871049980","method":"GET","path":"/get","protocol":"HTTP/1.1","scheme":"http"},"time":"2022-05-17T18:40:10.525394Z"},"source":{"address":{"socketAddress":{"address":"10.42.0.79","portValue":35662}},"principal":"spiffe://cluster.local/ns/frontend/sa/frontend"}},"parsed_body":null,"parsed_path":["get"],"parsed_query":{},"truncated_body":false,"version":{"encoding":"protojson","ext_authz":"v3"}},"labels":{"id":"bcaae62f-8d6c-44c4-a360-9fefb0cb8de3","version":"0.40.0-envoy-1"},"level":"info","metrics":{"timer_rego_query_eval_ns":250000,"timer_server_handler_ns":494000},"msg":"Decision Log","query":"data.envoy.authz.allow","result":true,"time":"2022-05-17T18:40:10Z","timestamp":"2022-05-17T18:40:10.526902Z","type":"openpolicyagent.org/decision_logs"}

Each decision comes in the form of a nested JSON object - we're particularly interested in a couple of key fields here:

  • The x-forwarded-client-cert:
"x-forwarded-client-cert":"By=spiffe://cluster.local/ns/backend/sa/default;Hash=d575dadf5f4561d2a52cabfb71cf6ed8099d4b282d1c12a9cc45b9dd568782c5;Subject=\"\";URI=spiffe://cluster.local/ns/frontend/sa/frontend"

Here, we can see that the URI field matches what we expected in the OPA policy.

  • The result of the decision
"result":true

Now, let's modify the OPA policy and see what happens when the SPIFFE ID does not match.

package envoy.authz
import input.attributes.request.http as http_request
default allow = false
allow {
    svc_spiffe_id == "spiffe://cluster.local/ns/foo/sa/bar"
}

svc_spiffe_id = spiffe_id {
    [_, _, _, uri_type_san] := split(http_request.headers["x-forwarded-client-cert"], ";")
    [_, spiffe_id] := split(uri_type_san, "=")
}

After updating policy.rego, recreate the secret and restart the backend pod so OPA picks up the new policy:

kubectl delete secret opa-policy -n backend
secret "opa-policy" deleted
kubectl create secret generic opa-policy -n backend --from-file policy.rego
kubectl delete pod <backend_pod_name> -n backend

Let's fire off the curl command again:

root@frontend-7b864c8565-l75x9:/# curl http://backend.backend:8000/get -w "%{http_code}\n" -s -o /dev/null
403

And take a look at the decision log:

kubectl logs backend-5f576cf79f-4dp9b -n backend opa   

{"addrs":["localhost:8181"],"diagnostic-addrs":[],"level":"info","msg":"Initializing server.","time":"2022-05-17T18:47:59Z"}
{"level":"info","msg":"Starting decision logger.","plugin":"decision_logs","time":"2022-05-17T18:47:59Z"}
{"addr":":9191","dry-run":false,"enable-reflection":false,"level":"info","msg":"Starting gRPC server.","path":"","query":"data.envoy.authz.allow","time":"2022-05-17T18:47:59Z"}
{"decision_id":"0603988b-cab6-40e1-a21e-2b172ca782ca","input":{"attributes":{"destination":{"address":{"socketAddress":{"address":"10.42.0.84","portValue":80}},"principal":"spiffe://cluster.local/ns/backend/sa/default"},"metadataContext":{},"request":{"http":{"headers":{":authority":"backend.backend:8000",":method":"GET",":path":"/get",":scheme":"http","accept":"*/*","user-agent":"curl/7.58.0","x-b3-sampled":"0","x-b3-spanid":"bd93fe3c7f52f0df","x-b3-traceid":"453e3cb475a902b3bd93fe3c7f52f0df","x-envoy-attempt-count":"1","x-envoy-peer-metadata":"ChwKDkFQUF9DT05UQUlORVJTEgoaCGZyb250ZW5kChoKCkNMVVNURVJfSUQSDBoKS3ViZXJuZXRlcwoZCg1JU1RJT19WRVJTSU9OEggaBjEuMTMuMwrKAQoGTEFCRUxTEr8BKrwBChEKA2FwcBIKGghmcm9udGVuZAohChFwb2QtdGVtcGxhdGUtaGFzaBIMGgo3Yjg2NGM4NTY1CiQKGXNlY3VyaXR5LmlzdGlvLmlvL3Rsc01vZGUSBxoFaXN0aW8KLQofc2VydmljZS5pc3Rpby5pby9jYW5vbmljYWwtbmFtZRIKGghmcm9udGVuZAovCiNzZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1yZXZpc2lvbhIIGgZsYXRlc3QKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsCiMKBE5BTUUSGxoZZnJvbnRlbmQtN2I4NjRjODU2NS1sNzV4OQoXCglOQU1FU1BBQ0USChoIZnJvbnRlbmQKTQoFT1dORVISRBpCa3ViZXJuZXRlczovL2FwaXMvYXBwcy92MS9uYW1lc3BhY2VzL2Zyb250ZW5kL2RlcGxveW1lbnRzL2Zyb250ZW5kChcKEVBMQVRGT1JNX01FVEFEQVRBEgIqAAobCg1XT1JLTE9BRF9OQU1FEgoaCGZyb250ZW5k","x-envoy-peer-metadata-id":"sidecar~10.42.0.79~frontend-7b864c8565-l75x9.frontend~frontend.svc.cluster.local","x-forwarded-client-cert":"By=spiffe://cluster.local/ns/backend/sa/default;Hash=d575dadf5f4561d2a52cabfb71cf6ed8099d4b282d1c12a9cc45b9dd568782c5;Subject=\"\";URI=spiffe://cluster.local/ns/frontend/sa/frontend","x-forwarded-proto":"http","x-request-id":"748cf9f9-927d-43db-9a8b-9b6439f78d9e"},"host":"backend.backend:8000","id":"3054161996610532426","method":"GET","path":"/get","protocol":"HTTP/1.1","scheme":"http"},"time":"2022-05-17T18:48:33.799689Z"},"source":{"address":{"socketAddress":{"address":"10.42.0.79","portValue":38782}},"principal":"spiffe://cluster.local/ns/frontend/sa/frontend"}},"parsed_body":null,"parsed_path":["get"],"parsed_query":{},"truncated_body":false,"version":{"encoding":"protojson","ext_authz":"v3"}},"labels":{"id":"3e581fc0-c4b3-4da7-8a68-a1d702e6b352","version":"0.40.0-envoy-1"},"level":"info","metrics":{"timer_rego_query_compile_ns":85000,"timer_rego_query_eval_ns":146000,"timer_server_handler_ns":822000},"msg":"Decision Log","query":"data.envoy.authz.allow","result":false,"time":"2022-05-17T18:48:33Z","timestamp":"2022-05-17T18:48:33.805277Z","type":"openpolicyagent.org/decision_logs"}

As you can see, this time the decision result was false - the SPIFFE ID presented by our frontend workload did not match the value we set in the OPA policy, so the request was denied with a 403.

Conclusion

The exercise we've gone through is just one basic way to begin implementing better security measures within your applications. If you're interested in learning more about modern software security, a few great resources include:

CNCF Supply Chain Security Best Practices Paper

Solving the Bottom Turtle

Official Kubernetes Security Documentation