CNCF: What’s Incubating — Part 1: NATS Messaging



The Cloud Native Computing Foundation (CNCF) is the home for cloud native projects and sits right under the Linux Foundation.

It hosts very well known projects (and for some, their CI/CD) like Kubernetes, Prometheus, Envoy, Fluentd and more. Well, these are the graduated ones. Graduated projects are projects that have passed major adoption milestones, have been voted on, and are considered very stable.

Project creators and communities have strong incentives to host their projects under the CNCF.

“In a world where GitHub use is ubiquitous, it is no longer sufficient for a software foundation to offer just a software repo, mailing list, and website. An enhanced set of services is required as it facilitates increased adoption.”

CNCF provides its hosted projects with many services and incentives, such as financial investment, community tooling and foundations, program management, event management, marketing services, certifications, legal services and more.

As part of our work at CloudZone, a multi-cloud premier consulting partner, CNCF Silver Member and CNCF Kubernetes Certified Service Provider, we include many of the CNCF tools and frameworks in our solutions.

This series of blog posts is intended to summarize in detail the CNCF hosted projects that are currently in the incubation stage (after sandboxing, before graduating).


NATS Messaging

NATS is a lightweight cloud native messaging system for next generation cloud native distributed applications, edge computing platforms and devices. It is already 10 years old and is production proven (it was originally built to power Cloud Foundry).

NATS offers:

  • High resilience
  • Strong security
  • High scalability with built-in load balancing and auto scaling
  • An extremely lightweight footprint
  • A frictionless experience for developers in an agile environment
  • QoS for messaging
  • Support for 30 client SDKs

NATS is focused on performance, security, simplicity, and availability. Its community has grown in the last 2 years. Typical use cases include:

  • Cloud Messaging: Microservices Transport, Control Plane, Service Discovery and Event sourcing
  • IoT and Edge
  • Mobile
  • High Fan-out messaging
  • Augmenting or replacing legacy messaging systems

Core Entity: Subject: You can think of it as a Topic to which any client can subscribe. It is a string representing an interest in data, and subscribers can subscribe to subjects using wildcards that match these subject strings.
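To make the wildcard idea concrete, here is a minimal Python sketch of subject matching. It assumes the conventions described in the NATS documentation, which this post does not spell out: subjects are dot-separated tokens, `*` matches exactly one token, and `>` matches one or more trailing tokens. The function name `subject_matches` is made up for illustration and is not part of any NATS client.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Toy matcher: '*' matches one token, '>' matches one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # the subject ran out of tokens before the pattern did
        if token != "*" and token != s_tokens[i]:
            return False  # literal token mismatch
    # Without '>', pattern and subject must have the same number of tokens.
    return len(p_tokens) == len(s_tokens)


print(subject_matches("orders.*", "orders.eu"))          # True
print(subject_matches("orders.*", "orders.eu.created"))  # False
print(subject_matches("orders.>", "orders.eu.created"))  # True
```

A subscriber interested in everything under `orders` would use `orders.>`, while `orders.*` only covers subjects exactly one level below it.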

Pub/Sub: Publish a message to a subject, and every subscriber on this subject will receive it. Used for high fan-out and parallelization of work.


Load balanced queue subscribers: When you create subscribers you can add them to a group. When a message is published, only one member of the group receives it. You can think of it as a Kafka Consumer Group. Used for load balancing, auto-scaling, and lame duck mode during upgrades.
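The difference between plain pub/sub and queue subscribers can be sketched with a tiny in-memory broker. This is a toy model, not the NATS server or client API: `ToyBroker`, the subject `orders.created` and the group name `workers` are all hypothetical, subjects are matched exactly (no wildcards), and the random choice of a group member is an assumption made for the sketch.

```python
import random
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a message broker; exact subject match only."""

    def __init__(self):
        self._plain = defaultdict(list)                        # subject -> [callback]
        self._groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> [callback]

    def subscribe(self, subject, callback, queue=None):
        if queue is None:
            self._plain[subject].append(callback)
        else:
            self._groups[subject][queue].append(callback)

    def publish(self, subject, message):
        # Plain subscribers: every one of them gets a copy (fan-out).
        for callback in self._plain.get(subject, []):
            callback(message)
        # Queue groups: one randomly chosen member per group gets the message.
        for members in self._groups.get(subject, {}).values():
            random.choice(members)(message)


broker = ToyBroker()
audit_log, worker_a, worker_b = [], [], []
broker.subscribe("orders.created", audit_log.append)                  # plain subscriber
broker.subscribe("orders.created", worker_a.append, queue="workers")  # queue group member
broker.subscribe("orders.created", worker_b.append, queue="workers")  # queue group member

for i in range(10):
    broker.publish("orders.created", f"order-{i}")

print(len(audit_log))                 # 10: the plain subscriber sees every message
print(len(worker_a) + len(worker_b))  # 10: each message went to exactly one worker
```

Adding a third worker to the `workers` group would spread the same ten messages over three consumers, which is the load balancing and scaling behavior described above.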


Request/Reply: Unique reply subjects enable request/reply patterns in which you send a request to many subscribers and receive only one response from the NATS cluster, thereby getting the fastest response with the least latency. In this case NATS prevents the message from propagating further once a response has been sent. Used for sending a request to many subscribers and handling the first response; you can scale out the subscribers to get faster responses.
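The "ask many, keep the first answer" idea can be illustrated with plain `asyncio`, without any NATS library. In this sketch the responders are simulated coroutines with artificial delays; the names `responder` and `request_first_reply` are hypothetical, and a real NATS client would do this over the network with a reply subject instead.

```python
import asyncio


async def responder(name: str, delay: float, request: str) -> str:
    """Simulated subscriber that answers after `delay` seconds."""
    await asyncio.sleep(delay)
    return f"{name} handled {request!r}"


async def request_first_reply(request: str, responders: dict) -> str:
    """Send the request to every responder and return the first reply only."""
    tasks = [
        asyncio.create_task(responder(name, delay, request))
        for name, delay in responders.items()
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # The slower responders are no longer needed once one reply has arrived.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


reply = asyncio.run(
    request_first_reply("price?", {"slow": 0.2, "fast": 0.01, "medium": 0.1})
)
print(reply)
```

Adding more responders can only lower the time to the first reply, which is why scaling the subscribers helps latency in this pattern.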


Clients can invoke a graceful shutdown using the drain API, in which the client unsubscribes and stops receiving new messages but continues to process buffered ones. Used to prevent data loss during shutdown or scale-down events and during application upgrades.
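The drain behavior described above can be modeled in a few lines. This is a simplified sketch rather than the real client implementation: `DrainingSubscriber` is a hypothetical class that buffers messages and, once drained, refuses new ones while still processing what it already holds.

```python
from collections import deque


class DrainingSubscriber:
    """Toy subscriber with a buffer and a drain() that loses no buffered message."""

    def __init__(self, handler):
        self._handler = handler
        self._pending = deque()
        self._accepting = True

    def deliver(self, message) -> bool:
        """Called by the (imaginary) server; returns False once draining started."""
        if not self._accepting:
            return False
        self._pending.append(message)
        return True

    def drain(self):
        # Step 1: stop taking new messages (the "unsubscribe" part).
        self._accepting = False
        # Step 2: finish everything that was already buffered.
        while self._pending:
            self._handler(self._pending.popleft())


processed = []
sub = DrainingSubscriber(processed.append)
sub.deliver("m1")
sub.deliver("m2")
sub.drain()
accepted_after_drain = sub.deliver("m3")

print(processed)             # ['m1', 'm2']
print(accepted_after_drain)  # False
```

In a scale-down event the message `m3` would simply be handled by another subscriber, while `m1` and `m2` are not lost.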

NATS prioritises the health of the system as a whole over any individual client or server. For example, when a NATS client is not consuming fast enough, the NATS server will cut this client off, considering it unhealthy. The NATS cluster is a full mesh of NATS servers in which any server can go down and the rest will take over its clients. Also, connections between servers and clients are self-healing, meaning they will try to reconnect. NATS has proved to be very stable, running under load for long periods without interruption.

Extremely fast: 18 million messages per second with a single server and a single data stream, and up to 80 million per second with multiple streams. It is very scalable.

NATS is a single binary, provided as an 8MB Docker image. The protocol is text-based, while the payloads are binary. Configuration is no more than a URL and credentials. Servers are auto-discoverable and share the discovered topology and configuration, all of which is transparent to the clients. The client APIs are simple and straightforward.

NATS has a Prometheus exporter for exposing metrics and a Fluentd plugin.

NATS supports two delivery modes:

  • At most once (Core), where there is no guarantee of message delivery. In this case the application must detect and handle lost messages.
  • At least once (Streaming), where a message will always be delivered, and in certain cases more than once.
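Since at-least-once delivery can hand the same message to a consumer more than once, consumers are usually written to tolerate duplicates. The sketch below simulates that situation without any NATS code: a lost acknowledgement causes a redelivery, and the consumer deduplicates by sequence number. Both function names and the "redeliver exactly once" behavior are assumptions made for illustration.

```python
def deliver_at_least_once(messages, lost_acks):
    """Return (seq, payload) deliveries; a lost ack for seq triggers one redelivery."""
    deliveries = []
    for seq, payload in enumerate(messages, start=1):
        deliveries.append((seq, payload))
        if seq in lost_acks:
            # The sender never saw the ack, so it sends the message again.
            deliveries.append((seq, payload))
    return deliveries


def process_idempotently(deliveries):
    """Handle each sequence number once, skipping redelivered duplicates."""
    seen = set()
    processed = []
    for seq, payload in deliveries:
        if seq in seen:
            continue  # duplicate caused by redelivery
        seen.add(seq)
        processed.append(payload)
    return processed


deliveries = deliver_at_least_once(["a", "b", "c"], lost_acks={2})
print(deliveries)                        # [(1, 'a'), (2, 'b'), (2, 'b'), (3, 'c')]
print(process_idempotently(deliveries))  # ['a', 'b', 'c']
```

With at-most-once delivery the trade-off is reversed: there are no duplicates to filter, but a dropped message is simply gone unless the application notices and recovers.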

By choice, NATS does not provide an exactly-once mode, as it is considered unnecessary, slow and complex, which is at odds with one of the core missions of NATS: simplicity and high performance.

Streaming also includes features such as replay by time or sequence number, rate matching per subscriber, storage tiers (memory, file, database), and HA and scale through partitioning.


Multi-tenancy is possible in NATS via accounts, which are enriched with Services and Streams.

Accounts: Isolated communication contexts that let you use a single NATS deployment for multiple isolated operators. Accounts can share data with other accounts via Services and Streams.

Service: RPC endpoints that enable a Request/Reply delivery pattern between accounts, i.e. a one-to-one conversation. Used for: monitoring probes, certificate generation services, secure vault.

Streams: Enable the publish/subscribe pattern between accounts. Used for: global alerts, Slack-like solutions, Twitter feeds.

For example, given two clients that publish and subscribe on Subject X in account A, no subscriber of Subject X in account B will receive the message unless there is a Stream allowing it. The great thing is that none of the above requires any client configuration.
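The account A / account B example can be modeled as follows. This is a toy illustration of the isolation rule only; `ToyAccountServer` and `share_stream` are made-up names and do not reflect how NATS actually configures shared Streams.

```python
from collections import defaultdict


class ToyAccountServer:
    """Subjects are scoped per account; a shared stream opens one subject to another account."""

    def __init__(self):
        self._subs = defaultdict(list)    # (account, subject) -> [callback]
        self._streams = defaultdict(set)  # (source account, subject) -> {target accounts}

    def subscribe(self, account, subject, callback):
        self._subs[(account, subject)].append(callback)

    def share_stream(self, source, subject, target):
        """Let `target` receive what `source` publishes on `subject`."""
        self._streams[(source, subject)].add(target)

    def publish(self, account, subject, message):
        # Deliver inside the publishing account plus any account the stream is shared with.
        targets = {account} | self._streams.get((account, subject), set())
        for target in targets:
            for callback in self._subs.get((target, subject), []):
                callback(message)


server = ToyAccountServer()
inbox_a, inbox_b = [], []
server.subscribe("A", "X", inbox_a.append)
server.subscribe("B", "X", inbox_b.append)

server.publish("A", "X", "first")   # B is isolated from A at this point
server.share_stream("A", "X", "B")  # now a Stream allows it
server.publish("A", "X", "second")

print(inbox_a)  # ['first', 'second']
print(inbox_b)  # ['second']
```

Note that the subscribers themselves never change: the sharing is decided on the server side, which matches the point that no client configuration is needed.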

NATS is secured with:

  • Authentication via TLS certificates, basic credentials, NKeys (based on Ed25519) and JWTs
  • Encryption with TLS
  • Policy
  • Allow/Deny based Subject authorization with wildcard support

All updates to these entities are applied with zero downtime.
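The allow/deny subject authorization with wildcards mentioned in the list above can be sketched as a small permission check. Two things here are assumptions rather than statements from this post: the wildcard rules (`*` for one token, `>` for one or more trailing tokens, as described in the NATS documentation) and the choice that a matching deny rule wins over a matching allow rule. The names `_matches` and `is_allowed` are hypothetical.

```python
def _matches(pattern, subject):
    """Token-wise match: '*' is one token, a trailing '>' is one or more tokens."""
    p, s = pattern.split("."), subject.split(".")
    if p[-1] == ">":
        head = p[:-1]
        return len(s) > len(head) and all(a in ("*", b) for a, b in zip(head, s))
    return len(p) == len(s) and all(a in ("*", b) for a, b in zip(p, s))


def is_allowed(subject, allow, deny=()):
    """Assumed policy: any matching deny rule wins, otherwise an allow rule must match."""
    if any(_matches(pattern, subject) for pattern in deny):
        return False
    return any(_matches(pattern, subject) for pattern in allow)


allow = ["orders.>"]
deny = ["orders.internal.>"]
print(is_allowed("orders.eu.created", allow, deny))      # True
print(is_allowed("orders.internal.audit", allow, deny))  # False
print(is_allowed("billing.eu", allow, deny))             # False
```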

Clusters of NATS clusters can be used for a global implementation of NATS messaging. At its core, you can use globally distributed, load balanced queue subscribers that are geo-aware. For example, given queue subscribers in Europe and in the US, if I publish a message from Europe, NATS will deliver the message to one of the subscribers in Europe.
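The geo-aware selection in the Europe/US example boils down to "prefer a queue subscriber in the publisher's region". The sketch below models only that selection step. The function name and the region labels are hypothetical, and falling back to a remote subscriber when no local one exists is an assumption of this sketch, not something stated above.

```python
import random


def pick_queue_subscriber(publisher_region, subscribers):
    """subscribers: list of (name, region). Prefer same-region members, else any member."""
    local = [name for name, region in subscribers if region == publisher_region]
    # Assumed fallback: with no local member, any remote member may take the message.
    candidates = local or [name for name, _ in subscribers]
    if not candidates:
        return None
    return random.choice(candidates)


subscribers = [("eu-1", "europe"), ("eu-2", "europe"), ("us-1", "us")]
print(pick_queue_subscriber("europe", subscribers))  # one of the European subscribers
print(pick_queue_subscriber("asia", subscribers))    # no local member, so any subscriber
```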

  • Integration with more messaging projects like Kafka
  • Native MQTT Support & Microcontroller clients for IoT
  • WebSocket Support

Kubernetes integration is available via the NATS Operator, which creates and manages NATS clusters. The Operator uses K8s RBAC for authorization with service accounts and provides hot reload of configuration stored in Secrets.

A Helm chart is also available for installing NATS on Kubernetes.
