Using NATS to Implement Service Mesh Functionality, Part 2: Security

Oct 22, 2019

NATS 2.0 Security with Operators, Accounts, and Users

Continuing the service mesh discussion from my first write-up, the next area I want to look at with NATS 2.0 is security. The security pieces of a service mesh involve end-to-end encryption (mutual TLS), authentication, authorization policies, and service-to-service access control. With NATS 2.0 and the introduction of NKeys, JWTs, and the Operator, Account, and User security model, I believe there is a great deal to use toward a more secure communication infrastructure using NATS. In this post we dive into that specifically and compare and contrast the NATS model with the security model of mainstream service meshes.

Service Mesh — Secure by Default, Defense in Depth, AuthN, AuthZ, and more!

The service mesh landscape has a boatload of buzzwords in its marchitecture for sure. I put a few of them in the heading just above. However, these tools also have a lot of goodness baked in, with configuration you can turn on and off to secure your application services and their communication. So let’s level set on service mesh security and then dive into some comparisons and contrasts with how NATS 2.0 security operates. This is not a deep dive on security. I have links throughout for you to learn the specifics at your leisure.

I do have one pet peeve. The one thing that service meshes such as Istio preach is that no changes are needed to your application code or infrastructure to secure your services. They make it sound easy and simple… when in fact you do have to install Istio correctly. And more importantly, you have to understand it and configure it correctly. There is a lot going on in Istio, and it lets you do a lot of things.

That complexity means tweaking configuration files (YAML) and running commands to apply the configuration. There is a pretty big learning curve to do that. I am not trying to discourage you. Just be aware of what you are undertaking. Once implemented, the security model for service meshes does work well. And it controls the security through its internal components, listed below.

It can get to be a PITA to have a bunch of YAML files around to go with setting up your project. And YAML can be hard to debug if your spacing is off. So that is a personal drawback. However, YAML files = Infrastructure as Code in my book so that is a plus.

As an example of this security setup, if you check out the Istio docs linked above, you can see it has a few running components for security:

Citadel for the key and certificate management
Sidecar Proxies to implement secure communication between the clients and the servers
Pilot to distribute the policies and secure naming
Mixer to manage authorization and auditing

Example Service Mesh setup for AuthN/AuthZ from https://istio.io/docs/concepts/security/

These components work together to ensure secure communication. Proxy sidecars (think the sidecars on motorcycles) work to communicate securely between each other next to your services. Your services talk through these sidecars as proxies out to and in from the other services. Clients make sure servers are valid. Servers make sure clients are valid and authorized based on policies. And they audit who did what at the specific point in time. I am at a 50,000 foot view on this of course and there are more intricacies to make sure this works.

The main point here is that Istio and tools like it handle security between the services internally with their setup. Your APIs and services need to know who to communicate with for your application. And you use YAML to configure Istio to do the secure service-to-service communications. I do encourage you to read the security documentation for Istio to get a better idea. Or listen to the many talks on YouTube by folks such as Christian Posta from Solo.io and the recorded sessions from KubeCon 2018.

NATS 2.0 — Operators, Accounts, and Users OH MY!

In comparison to the service mesh model above, the NATS 2.0 security model is more decentralized. With the 2.0 release you can use the memory-based resolver (or the NATS Account Server for larger deployments) and the nsc tool to set up Operators, Accounts, and Users. See the image below for a visual hierarchy and grouping concept. Operators, in my opinion, are the root authority or top-level security construct. You first create an Operator, which is responsible for running the NATS servers and signing the JSON Web Tokens (JWTs) made for accounts. Accounts do a similar job, signing the JWTs made for the users (or message clients) in that account. All this enables the other pieces of security and AuthN/AuthZ within NATS.
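To make that signing hierarchy a bit more concrete, here is a minimal Python sketch of walking the chain from user to account to operator. It rests on some loud assumptions: it only checks that the issuer (`iss`) of each token matches the subject (`sub`) of the level above, it does not verify the Ed25519 signatures that real NATS JWTs carry, and it ignores signing keys. The key strings are made-up placeholders (real NKey public keys are much longer), and every function name is hypothetical rather than part of any NATS library.

```python
import base64
import json


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, the way JWT segments are encoded."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_toy_token(issuer: str, subject: str) -> str:
    """Build an UNSIGNED stand-in for a NATS JWT (illustration only)."""
    header = _b64url(json.dumps({"typ": "JWT"}).encode())
    claims = _b64url(json.dumps({"iss": issuer, "sub": subject}).encode())
    return f"{header}.{claims}.no-signature"


def read_claims(token: str) -> dict:
    """Decode the middle (claims) segment of a JWT-shaped string."""
    segment = token.split(".")[1]
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def chain_is_consistent(operator_jwt: str, account_jwt: str, user_jwt: str) -> bool:
    """Check issuer/subject links only; signatures are NOT verified here."""
    operator = read_claims(operator_jwt)
    account = read_claims(account_jwt)
    user = read_claims(user_jwt)
    return (
        operator["iss"] == operator["sub"]    # assumed: operator token is self-issued
        and account["iss"] == operator["sub"]  # account token issued by the operator
        and user["iss"] == account["sub"]      # user token issued by the account
    )


# Placeholder keys: O = operator, A = account, U = user.
operator_jwt = make_toy_token("OEXAMPLE", "OEXAMPLE")
account_jwt = make_toy_token("OEXAMPLE", "AEXAMPLE")
user_jwt = make_toy_token("AEXAMPLE", "UEXAMPLE")
stray_user_jwt = make_toy_token("AOTHER", "USTRAY")  # issued by an unknown account

print(chain_is_consistent(operator_jwt, account_jwt, user_jwt))
print(chain_is_consistent(operator_jwt, account_jwt, stray_user_jwt))
```

The point of the sketch is the shape of the trust chain: nothing needs to hold a private key to follow it, only the public identities recorded in each token.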

Accounts are created and signed by Operators and are how you achieve multi-tenancy in NATS 2.0. Think apartment building (multiple tenants) versus single-family house. To me they are comparable to namespaces in Kubernetes, or to containers and the application isolation they provide. You then have one or more Users mapped into each account. By default, users can exchange messages only with other users in the same account. You have to use services and streams (discussed later) to share information across accounts.
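The apartment building idea can be sketched in a few lines of Python. This is a toy in-process model, not NATS itself: the `ToyBroker` class, the account names, and the subject are all hypothetical, and it only does exact subject matches. What it shows is that the same subject in two accounts is really two separate subjects.

```python
from collections import defaultdict


class ToyBroker:
    """In-process stand-in for a server that scopes subjects by account."""

    def __init__(self):
        # (account, subject) -> list of subscriber callbacks
        self._subs = defaultdict(list)

    def subscribe(self, account, subject, callback):
        self._subs[(account, subject)].append(callback)

    def publish(self, account, subject, payload):
        """Deliver only to subscribers in the publisher's own account."""
        handlers = self._subs.get((account, subject), [])
        for handler in handlers:
            handler(payload)
        return len(handlers)  # number of subscribers reached


broker = ToyBroker()
billing_inbox, shipping_inbox = [], []

# Two tenants subscribe to the same subject name.
broker.subscribe("billing", "orders.created", billing_inbox.append)
broker.subscribe("shipping", "orders.created", shipping_inbox.append)

# A user in the billing account publishes; shipping never sees it.
delivered = broker.publish("billing", "orders.created", "order-42")
print(delivered, billing_inbox, shipping_inbox)
```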

The NATS Docs for the nsc tool and Account Server go over this well. I spent a total of 3 hours reading and re-reading, taking notes, and trying the examples to fully digest the security model. Refer to that documentation for the implementation details. Also for a quick introduction on the NATS security model, Kevin Hoffman’s article on the subject has great information to read and digest as well.

To store all this security model information you can use a file structure created by the nsc tool, an in-memory resolver (for smaller deployments with a more static account structure), or the new NATS Account Server to keep track of your user security. We discuss the account server just below.

NATS 2.0 Operator — Account — User Model for AuthN and AuthZ

There are pros and cons with all of this stuff. There is no perfect answer, just what works well for you and your team. I do believe it is good to have alternatives though! Which is why I am working to compare the secure communication of service mesh to what NATS 2.0 can do for you and your team.

Setting up Permissions and Policies

Once you have the Operator, Account(s), and User(s) set up with the nsc tool as described above, you decide on the permissions for each user and the policies for what is allowed between users within an account as well as between accounts. In other words, once you have your layout of operator, accounts per operator, and users per account, you specify what is allowed or denied for those users.

NATS 2.0 also included the ideas of Streams and Services. In my head, streams are “things my account publishes that can go external to my account” in a pub/sub setup. And when I think services I think of “things that other accounts can request from my account that I will reply back to” in a request/reply setup. You can do these in public or private access.

Public access means any account can use the export; it only needs to know what to subscribe to or what to request. Private access is more in line with the YAML configurations in service meshes, where you limit which accounts can import your exported stream or which accounts can do request/reply against your exported service. You can also further restrict the users created under an account to subscribe or publish only to certain subjects or wildcard patterns.

This allows you to control not only the messages inside accounts but also what users (think client connections to NATS here) are allowed to do in order to access messages in other accounts. You can secure your message flow around accounts and users to segment the traffic in your applications.
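Here is how I picture public versus private exports, as a small Python sketch. It is only a mental model under stated assumptions: the `Export` class, the `can_import` helper, and the account and subject names are all hypothetical. In real NATS the private case is handled by the exporting account issuing a token for the specific importing account, so the plain set of account names below is a stand-in for that step.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Export:
    owner: str    # account that owns the subject
    subject: str
    kind: str     # "stream" (pub/sub) or "service" (request/reply)
    importers: Optional[FrozenSet[str]] = None  # None means public


def can_import(export: Export, account: str) -> bool:
    """Public exports are open to any account; private ones are allow-listed."""
    if export.importers is None:
        return True
    return account in export.importers


# A public stream: anyone who knows the subject can import it.
public_stream = Export("inventory", "stock.updates", "stream")

# A private service: only the storefront account may send requests to it.
private_service = Export(
    "billing", "invoice.lookup", "service", frozenset({"storefront"})
)

print(can_import(public_stream, "analytics"))
print(can_import(private_service, "storefront"))
print(can_import(private_service, "analytics"))
```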

By default, users have no limits on the subjects they can subscribe or publish to under their own account. Using the nsc tool you can limit this and control the flow of messages at another level of detail, even within an account. There are some nuances around the “_INBOX.>” subject used in request/reply that you have to keep in mind here. And in the latest release there is a permission that says “publish only to reply subjects”. Otherwise it is pretty straightforward to restrict messages. The examples in the NATS Docs help show these for you to see and understand. Definitely test and retest before deploying to production on all the ways message subjects should and should not be accepted.
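The wildcard and “_INBOX.>” nuance is easier to see in code. Below is a minimal Python sketch of subject matching, where `*` matches exactly one token and `>` matches one or more trailing tokens, plus a toy allow/deny check. The `Permission` class is hypothetical, and two behaviors are my assumptions for the sketch: an empty allow list means no limits, and deny entries win over allow entries. It also assumes well-formed subjects with no empty tokens.

```python
from dataclasses import dataclass, field


def subject_matches(pattern: str, subject: str) -> bool:
    """Match a subject against a pattern using '*' and '>' wildcards."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the last token and needs at least one token to match
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # subject ran out of tokens
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)


@dataclass
class Permission:
    allow: list = field(default_factory=list)  # assumed: empty means no limits
    deny: list = field(default_factory=list)

    def permits(self, subject: str) -> bool:
        if any(subject_matches(p, subject) for p in self.deny):
            return False  # assumed: deny takes precedence
        if not self.allow:
            return True
        return any(subject_matches(p, subject) for p in self.allow)


# A user locked down to "orders.*" cannot receive request/reply answers,
# because replies arrive on an inbox subject it may not subscribe to.
strict_sub = Permission(allow=["orders.*"])
relaxed_sub = Permission(allow=["orders.*", "_INBOX.>"])

print(strict_sub.permits("_INBOX.abc123"))
print(relaxed_sub.permits("_INBOX.abc123"))
```

Running a table of subjects through a check like this is a cheap way to do the "test and retest" step before touching a real server.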

NATS 2.0 Exporting and Importing across Accounts for Pub/Sub and Request/Reply

For now you do have to use a separate command line interface (CLI), the nsc tool, to manage the accounts and users as well as the permissions. (Consolidation on tooling is on its way, I am told!) And depending on what you are doing in your application(s), you need to know the public keys of accounts to generate proper JSON Web Tokens. There is no JSON or YAML file that I am aware of to do this as of yet. That is another difference in how you set up policies in NATS compared to using service mesh designs. Again, not better or worse, just different. And you need to know the differences to weigh what your situation, product, team, and environment can and should do.

One really cool feature of the NATS account server, as well as NATS itself, is that it stores no private keys, no user data, and no secret username/password combinations. I found that interesting and, at first, hard to believe. Now that I have started to understand the nsc tool and the account server, I see how they did it. And that was a cool design. Of course the DevSecOps automation-at-100% side of me still has to figure out how to automate all of that into a dev/test setup. But it was a nice surprise compared to hiding secrets with base64 encoding in Kubernetes or using HashiCorp Vault for secret management.

NATS 2.0 Security with the NATS Account Server

NATS 2.0 with the nsc tool allows you to create Operators, Accounts, and Users as a hierarchy of permissions to run with your NATS message servers. Combining the in-memory resolver for accounts and users (or the NATS Account Server for larger deployments) with TLS for encryption lets you secure the connection between your messaging clients and the NATS server. It also allows authorization on who-can-do-what with regard to messaging. We will get into some details below. And remember: TLS encrypts the communication, not the payload.

There are a few ways to run the NATS Account Server (NAS) type of setup. Look to the NATS Docs linked throughout this post to learn which to use. For my production I am going to run the memory resolver which can be reloaded without a server restart if there is a change. The NATS server(s) is started with this information so it knows who can do what as far as accounts and users. And if there are updates on accounts and users it can handle them fine.

Running NATS 2.0 with TLS and Certificates

Along with Operators and Accounts, the NATS Docs have great information on running servers with TLS enabled and clients connecting over TLS for encryption, including how to reference client certificates and root certificate authorities (CA). You may need the CA file if you are using self-signed certificates or your own CA server. Note that you have to supply the certificate files for the server and for the clients connecting to the NATS server.

This does not come automagically by adding ‘plumbing’ to the infrastructure, the way tools like Istio and Linkerd give you at least your AuthN. To me this is more like specifying HTTPS on the initial service you want to talk to, versus the service-to-service mutual TLS (mTLS) that Istio and Linkerd can give you. So you will have to manage your certificates here.

Using TLS and User Credentials for NATS Client connections

When it comes to encryption with NATS, you use the certificates on all clients and the server specifically. Or per Colin @ Synadia, a simpler pattern that works for some environments is to use server side only TLS in the NATS server. Combine that with NATS 2.0 credentials when communicating with the servers. You must decide your requirements and design/develop/test/deploy accordingly.
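The two patterns differ only in what the client has to carry. Here is a minimal sketch using Python's standard `ssl` module. The `build_client_tls` helper and the file paths in the comments are hypothetical; how the resulting context and a NATS 2.0 credentials file are handed to a client depends on the client library, so that part is left out.

```python
import ssl
from typing import Optional


def build_client_tls(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context for either pattern.

    ca_file:   CA bundle to trust (needed for self-signed or private CAs);
               None falls back to the system trust store.
    cert_file: client certificate; only needed when the server demands one.
    key_file:  private key for the client certificate.
    """
    # Always verify the server certificate and hostname.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # Certificates on every client: present our own cert to the server.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


# Pattern 1: server-side-only TLS. The client just verifies the server and
# would authenticate with its NATS 2.0 credentials instead of a certificate.
server_only = build_client_tls()

# Pattern 2: certificates on all clients (paths are placeholders, so this
# call is left commented out):
# mutual = build_client_tls("ca.pem", "client-cert.pem", "client-key.pem")

print(server_only.verify_mode == ssl.CERT_REQUIRED, server_only.check_hostname)
```

Either way, somebody on your team owns those files and their rotation, which is exactly the trade-off against a mesh that manages certificates for you.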

Managing the certificates is a key difference to me when comparing tools such as Linkerd and Istio with NATS 2.0. The service meshes make all communications go through the sidecar proxies, which have encryption enabled through their infrastructure setup. So they manage the certificates. With NATS, you need to point the servers and clients to the certificates to use. So you have to manage the certificates.

Security for a Service Mesh with NATS 2.0

As you can see from all of this, you can secure NATS in ways similar to how you secure service communication inside your favorite service mesh. Both offer TLS for encryption, with NATS being the one where you manage the server and client certificates yourself. And both offer ways to limit services and calls between services. The mainstream service mesh tools (Istio, Linkerd, etc.) generally use YAML configuration files to do this. NATS uses the Operator, Account, and User model and the nsc tool to line up message subjects for publishing and request/reply across accounts as well as within an account by user.

This post is just an introduction to show you that NATS 2.0 has security features you can use to secure your communication paths. There are many links throughout the article and just below for you to educate yourself on NATS 2.0 security changes and the various service mesh software available as of October 2019. I am sure more service meshes will come out. Just make sure you read the docs, not just the marketing website and Twitter rants, so you know how to weigh options and decide on a direction.

Personally, I like the lighter lift of NATS where it can be used and it makes the most sense. But I have been using NATS for a couple years and understand the message model and the event driven application construct for it because I have used it. And as I have learned, not everyone does or will want to learn that development model. This information in here hopefully helps you weigh options on securing communications for your applications and gives you a few alternatives in producing secure software and safeguarding your communication and data flow.

Reference Links on NATS and Service Mesh Tech

Below are the general links for the software tools I talked about. By the time you read this there may be even more service mesh ideas and software, as that is the current state of our industry. Just make sure you read the docs to see the truth!