Secure service-to-service communication with SPIFFE

In today’s interconnected world, where distributed systems and cloud-native architectures have become the industry standard, securing communication and establishing trust between components is essential.

The Cloud Native Computing Foundation (CNCF) has been at the forefront of developing open-source ecosystems that address the challenges faced by modern distributed systems. One of the CNCF projects is SPIRE (the SPIFFE Runtime Environment), a toolchain of APIs that provides strong authentication and establishes trust between software systems across a wide variety of hosting platforms.

If you are wondering what exactly SPIFFE is, feel free to check out our last article to learn more about it!

Let’s get to know SPIRE

The first thing you will find when you search for SPIRE is the word SPIFFE, because the two are closely intertwined. Both SPIFFE and SPIRE are CNCF projects that help organizations build secure zero-trust environments. As we learned in our previous post, SPIFFE is essentially an open-source standard for securely identifying all of your software systems. SPIRE, on the other hand, is an open-source implementation of the SPIFFE APIs.

SPIRE performs node and workload attestation in order to securely issue SVIDs (SPIFFE Verifiable Identity Documents) to workloads, and to verify the SVIDs of other workloads, based on a predefined set of conditions. In other words, SPIFFE is the framework and SPIRE is an actual implementation of it.
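Every SVID carries a SPIFFE ID, a URI of the form spiffe://<trust-domain>/<path>. To make that concrete, here is a minimal Python sketch that splits such an ID into its trust domain and path. The function name parse_spiffe_id and the example ID are our own illustrations, and the checks are deliberately simplified; they do not cover every rule in the SPIFFE specification.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple:
    """Split a SPIFFE ID into (trust_domain, path).

    Simplified illustration only: raises ValueError for obviously
    malformed IDs, but does not enforce the full SPIFFE specification.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError(f"scheme must be 'spiffe': {spiffe_id!r}")
    # The authority component is the trust domain; no user info or port allowed.
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"invalid trust domain: {spiffe_id!r}")
    # SPIFFE IDs carry neither a query string nor a fragment.
    if parts.query or parts.fragment:
        raise ValueError(f"query and fragment are not allowed: {spiffe_id!r}")
    return parts.netloc, parts.path


# Hypothetical ID used purely for illustration.
trust_domain, path = parse_spiffe_id("spiffe://example.org/billing/payments")
print(trust_domain, path)
```

The trust domain identifies who issued the identity, while the path names the individual workload inside that trust domain.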

SPIRE Architecture and Components

First we need to define what a workload is:

According to the SPIFFE documentation, a workload is a single piece of software, deployed with a particular configuration for a single purpose; it may comprise multiple running instances of software, all of which perform the same task. This can be a web server, an instance of a PostgreSQL database, a worker program or a web application composed of independently deployed microservices.

Now, let’s have a look at the two main components of a SPIRE deployment:

  • SPIRE Server:
    • Manages and issues all identities in its configured SPIFFE trust domain
    • Acts as a signing authority for identities issued to a set of workloads via agents
    • Keeps a registry of workload identities and the conditions that must be verified in order for those identities to be issued
    • Stores the signing keys and uses node attestation to authenticate agents’ identities automatically
    • Creates SVIDs for workloads when requested by an authenticated agent

  • SPIRE Agents (one or more):
    • Must be installed on each node where an identified workload is running
    • Requests SVIDs from the server and caches them until a workload requests its SVID
    • Exposes the SPIFFE Workload API locally to workloads on the node and attests the identity of workloads that call it
    • Provides the identified workloads with their SVID
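
The registry of identities and conditions described above can be pictured as a list of registration entries, each pairing a SPIFFE ID with a set of selectors such as unix:uid:1000. The following Python sketch is a toy model under our own assumptions, not SPIRE's actual data structures: the class RegistrationEntry, the function matching_entries and all IDs are hypothetical. It shows the basic idea that a workload qualifies for an entry only when every selector on that entry is observed on the workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistrationEntry:
    """Toy registration entry (hypothetical model, not SPIRE's schema)."""
    spiffe_id: str        # identity to issue to the workload
    parent_id: str        # SPIFFE ID of the agent allowed to issue it
    selectors: frozenset  # conditions the workload must satisfy


def matching_entries(entries, workload_selectors):
    """Return the entries whose selectors are all present on the workload."""
    observed = set(workload_selectors)
    return [entry for entry in entries if entry.selectors <= observed]


# Hypothetical agent and workload identities.
AGENT = "spiffe://example.org/agent-1"
entries = [
    RegistrationEntry("spiffe://example.org/web", AGENT,
                      frozenset({"unix:uid:1000"})),
    RegistrationEntry("spiffe://example.org/db", AGENT,
                      frozenset({"unix:uid:1001", "unix:gid:2000"})),
]

# Selectors the agent discovered for the calling process during attestation.
matched = matching_entries(entries, {"unix:uid:1000", "unix:gid:1000"})
print([entry.spiffe_id for entry in matched])
```

Only the web entry matches here, because the db entry requires a uid and gid that the calling process does not have.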

How does SPIRE work?

To understand how SPIRE works, we need to follow the SVID lifecycle (from Server startup to Agent attestation):

To bootstrap this environment, we start by booting up the SPIRE Server. The Server then generates a self-signed certificate, which it uses to sign the SVIDs it issues. If this is the first time that this particular Server is starting up, it also generates a trust bundle.

Once the Server is bootstrapped, it starts a registration API, which can be called to register new workloads. The SPIRE Server is now ready to receive traffic, which means we can start up the Agents. Each Agent then performs node attestation: it presents proof of the identity of the node it is running on, and the Server verifies that proof. If everything checks out, the SPIRE Server issues the Agent its own identity in the form of an SVID.
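
One of the simplest ways to attest a node is a one-time join token that the Server generates in advance and the Agent presents on first contact. The Python sketch below is a toy model of that idea only; the class JoinTokenServer and its methods are hypothetical names of ours, and the agent ID layout is our assumption modelled on the shape SPIRE uses for join-token agents. Real deployments often rely on platform-specific proofs instead.

```python
import secrets


class JoinTokenServer:
    """Toy server that attests agents with single-use join tokens."""

    def __init__(self, trust_domain):
        self.trust_domain = trust_domain
        self._unused_tokens = set()

    def generate_join_token(self):
        """Create a random token to be handed to one agent out of band."""
        token = secrets.token_hex(16)
        self._unused_tokens.add(token)
        return token

    def attest_node(self, token):
        """Accept each token exactly once and return the agent's SPIFFE ID."""
        if token not in self._unused_tokens:
            raise PermissionError("unknown or already used join token")
        self._unused_tokens.remove(token)
        # Assumed ID layout for illustration.
        return f"spiffe://{self.trust_domain}/spire/agent/join_token/{token}"


server = JoinTokenServer("example.org")  # hypothetical trust domain
token = server.generate_join_token()
agent_id = server.attest_node(token)
print(agent_id)
```

Because the token is removed after its first use, a second Agent replaying the same token is rejected.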

Once an Agent has its own identity, it starts bootstrapping itself. The Agent calls the SPIRE Server to obtain the list of registration entries that the node it is running on is authorized to issue workload identities for. That connection is established over mutual TLS (both sides authenticate each other).

The Server authenticates the Agent using the SVID that was issued in the previous step, while the Agent authenticates the Server using the trust bundle it received in the previous step as well. After that mutual authentication, the SPIRE Server fetches from its data store the registration entries that this Agent is authorized to issue and sends them back to the Agent as a list. Then, the Agent sends a workload CSR (Certificate Signing Request) to the Server for each entry. The Server signs each request and sends the resulting SVID back.
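
The exchange described above can be sketched as a small in-memory simulation in Python. This is a toy under loud assumptions: the class ToySigningServer, the function sync_agent and all IDs are hypothetical, the "CSR" is a plain dictionary, and an HMAC stands in for the X.509 signature that a real SPIRE Server would produce. It only illustrates the order of the steps and the authorization check.

```python
import hashlib
import hmac
import secrets


class ToySigningServer:
    """Toy server: filters entries per agent and signs CSRs for them."""

    def __init__(self, entries):
        # entries: list of (parent_agent_id, workload_spiffe_id) pairs
        self._entries = entries
        # Stand-in for the server's CA key (assumption: HMAC, not X.509).
        self._signing_key = secrets.token_bytes(32)

    def authorized_entries(self, agent_id):
        """Return the workload IDs this agent may request SVIDs for."""
        return [wid for parent, wid in self._entries if parent == agent_id]

    def _sign(self, spiffe_id, public_key):
        payload = f"{spiffe_id}|{public_key}".encode()
        return hmac.new(self._signing_key, payload, hashlib.sha256).hexdigest()

    def sign_csr(self, agent_id, csr):
        """Sign a CSR only if the agent is authorized for that identity."""
        if csr["spiffe_id"] not in self.authorized_entries(agent_id):
            raise PermissionError("agent is not authorized for this entry")
        return {**csr, "signature": self._sign(csr["spiffe_id"], csr["public_key"])}

    def verify(self, svid):
        expected = self._sign(svid["spiffe_id"], svid["public_key"])
        return hmac.compare_digest(expected, svid["signature"])


def sync_agent(server, agent_id):
    """Agent side: fetch entries, send one CSR per entry, cache the SVIDs."""
    svids = {}
    for spiffe_id in server.authorized_entries(agent_id):
        # Placeholder key material; a real agent generates a key pair here.
        csr = {"spiffe_id": spiffe_id, "public_key": secrets.token_hex(16)}
        svids[spiffe_id] = server.sign_csr(agent_id, csr)
    return svids


server = ToySigningServer([
    ("spiffe://example.org/agent-1", "spiffe://example.org/web"),
    ("spiffe://example.org/agent-2", "spiffe://example.org/db"),
])
cache = sync_agent(server, "spiffe://example.org/agent-1")
print(list(cache))
```

Checking authorization inside sign_csr, rather than trusting the Agent to request only its own entries, mirrors the idea that the Server is the final authority on which identities an Agent may obtain.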

Now the Agent has the list of workload SVIDs, which act as identity documents that each individual workload can use to prove its identity. Finally, the Agent starts listening on the Workload API socket, waiting for workloads to call the Workload API and request an identity.

[Figure: SPIRE architecture diagram]

Conclusion

As distributed systems continue to evolve, the need for secure communication and trust between services becomes increasingly critical. SPIRE addresses these challenges by providing workload attestation, identity management, and secure communication between components. It enables organizations to build and deploy distributed systems with confidence, knowing that their workloads are properly attested and authenticated. Its dynamic and scalable architecture makes it a suitable choice for cloud-native environments, where workloads are continuously deployed and scaled both in and out.