Service Meshes

Definition of Service Mesh

In microservices architectures (typically container-based [1]), a service mesh provides software components and mechanisms that separate the cross-cutting concerns of service-to-service communication from the business logic of the individual microservices and move them into a so-called “control plane”. This separation simplifies development, operations, and management of larger microservices environments. It is achieved by routing all communication between microservices (the so-called east/west traffic) through dedicated components called proxies.

Today, these proxies are mostly either (i) implemented directly in the microservice code itself (e.g., in the form of suitable libraries) or (ii) deployed as a separate software component next to, but independent of, the microservice (notably independent of the programming language it is written in) as a so-called “side car”. A side car typically runs in its own container alongside the microservice's container (in Kubernetes, within the same pod).

In addition to the control plane, where messages regarding cross-cutting concerns are exchanged (e.g., microservice AddUser no longer reachable), service meshes also distinguish a so-called “data plane”, which carries the business data to and from the respective microservices (e.g., Add User Homer Simpson or Get contents of shopping cart 4652201). A central (or possibly federated) «Control Plane Master» [2] coordinates control plane messages to and from all microservice proxies.

Generic service mesh architecture separating data and control plane

Note that proxies participate in both the data plane and the control plane, whereas the “raw” microservices are only involved in data plane communications.
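The split between the two planes can be illustrated with a minimal in-process sketch. It is not modeled on any particular service mesh product: the class names `ControlPlaneMaster` and `SidecarProxy` and all of their methods are hypothetical, and real proxies communicate over the network rather than through direct method calls. The point is only that the business logic (a plain function) never touches the control plane, while the proxy takes part in both.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ControlPlaneMaster:
    """Hypothetical coordinator: sees control plane messages only, never business data."""
    registry: Dict[str, "SidecarProxy"] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)

    def register(self, proxy: "SidecarProxy") -> None:
        # Control plane message: a proxy announces its microservice.
        self.registry[proxy.service_name] = proxy
        self.events.append(f"registered {proxy.service_name}")

    def report_unreachable(self, service_name: str) -> None:
        # Control plane message: a microservice is no longer reachable.
        self.registry.pop(service_name, None)
        self.events.append(f"{service_name} no longer reachable")

    def lookup(self, service_name: str) -> "SidecarProxy":
        # Control plane query: service discovery.
        return self.registry[service_name]


@dataclass
class SidecarProxy:
    """Hypothetical proxy sitting in front of exactly one microservice."""
    service_name: str
    handler: Callable[[str], str]  # the "raw" microservice: business logic only
    master: ControlPlaneMaster

    def __post_init__(self) -> None:
        self.master.register(self)  # control plane

    def call(self, target: str, payload: str) -> str:
        peer = self.master.lookup(target)  # control plane: where is the target?
        return peer.receive(payload)       # data plane: forward business data

    def receive(self, payload: str) -> str:
        return self.handler(payload)       # data plane: hand over to the microservice


master = ControlPlaneMaster()
add_user = SidecarProxy("AddUser", lambda name: f"added {name}", master)
frontend = SidecarProxy("Frontend", lambda payload: payload, master)

# East/west call: Frontend -> its proxy -> AddUser's proxy -> AddUser.
result = frontend.call("AddUser", "Homer Simpson")
```

In this sketch the two lambda functions stand in for the microservices; neither knows that a registry or a master exists.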

Cross-cutting concerns addressed in service meshes available to date include:

  • Configuration & deployment
  • Service identity & discovery
  • Observability (e.g., monitoring, tracing)
  • Resilience
  • Security
  • Traffic management
  • Policy enforcement
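As an illustration of one of these concerns, resilience, the following is a minimal sketch of the circuit-breaker pattern as a proxy might apply it to outgoing calls. The class `CircuitBreaker`, its parameters `max_failures` and `reset_after`, and the exception `CircuitOpenError` are assumptions made for this example and do not correspond to the API of any specific service mesh.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast instead of hitting the broken service.
                raise CircuitOpenError("circuit open, call rejected")
            # Half-open: allow one trial call; a single failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

Because this logic lives in the proxy, the microservice code itself does not need to contain it, which is the separation of concerns described above.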

Technology Evaluation

Shifting the management of cross-cutting concerns into a distinct control plane is an application of the architectural principle of separation of concerns. The “control plane” pattern stems from telecommunications protocols where it has worked and still works very well.

Historically, the first proxies realizing the control plane were implemented as libraries (e.g., Netflix OSS). Because of the limitations of this approach (e.g., programming-language dependence, microservice bloat because developers have to include the libraries in every microservice, increased operational complexity), current service mesh implementations favor the side car deployment model for the proxies.

However, it is also possible to implement an intermediate architecture using so-called node agents (e.g., Consul) or microgateways. These components also implement a control plane for east/west microservices communications but typically act as a proxy for several microservices, or even for classical (larger) applications, unlike a side car, which handles exactly one microservice. Whereas node agents focus on control plane functionality, microgateways originate from API gateways and provide the associated API management functionality in addition to control plane functions.

This indicates that there is a continuum between very fine-grained service meshes and a limited number of internal microgateways and API management facilities.

Limitations

The biggest limitations of (current) service mesh implementations are:

  • Significantly increased latencies (highly dependent on the individual setup, but factors of 2 to 10 have been observed)
  • Significantly increased resource consumption (CPU and RAM utilization)
  • Additional complexity of the control plane elements (e.g., proxies), especially for large microservices environments, which have to be managed in addition to the raw microservices. This includes knowledge and skill gaps for the teams implementing and managing service meshes which need to be overcome.
  • The differing sets of features and functionalities supported by the various service mesh implementations make the selection process non-trivial (e.g., support for protocols beyond plain HTTP such as WebSocket or gRPC, authorization, circuit-breaker pattern, tracing)
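Why the latency factor depends so strongly on the setup can be seen from a simple back-of-the-envelope model. It assumes a side car deployment in which every service-to-service hop passes through two proxies (the caller's and the callee's) and that each proxy adds a fixed delay. The function name `mesh_latency_ms` and all numbers are illustrative assumptions, not measurements.

```python
from typing import Tuple


def mesh_latency_ms(service_ms: float, network_ms: float,
                    proxy_ms: float, hops: int) -> Tuple[float, float, float]:
    """Return (direct, meshed, factor) for a sequential call chain.

    Assumption: each hop costs service time plus network time, and a
    meshed hop additionally traverses two proxies.
    """
    direct = hops * (service_ms + network_ms)
    meshed = hops * (service_ms + network_ms + 2 * proxy_ms)
    return direct, meshed, meshed / direct


# Illustrative values only: fast services make the proxy share dominate.
direct, meshed, factor = mesh_latency_ms(
    service_ms=1.0, network_ms=0.5, proxy_ms=1.5, hops=3)
# direct = 4.5 ms, meshed = 13.5 ms, factor = 3.0
```

Under this model the factor shrinks as the services themselves become slower: with `service_ms=50.0` and otherwise unchanged values, the same proxies add only about 6% to the total.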

Market - Current Adoption

Serious service mesh adoption in production environments (mostly Kubernetes) is still in its infancy (approximately 10-15% of organizations), with a third of organizations conducting various tests and another third planning to do so. Roughly 10-15% have no intention of engaging with service meshes at all.

A very good comparison of the most prominent service mesh implementations may be found at servicemesh.es.

Market - Outlook

Short-term adoption rates will remain low due to the complexities associated with introducing and operating service meshes, especially for larger microservices environments (e.g., where organizations have to manage hundreds of proxies).

In the medium to long term, service mesh offerings will mature considerably. The separation of concerns implemented through service meshes will, in general, make them the preferred choice for successfully developing and managing larger-scale microservices deployments.

In hybrid IT landscapes comprising microservices and classical (monolithic) applications, architectures will most likely become more complex and will include microgateways or internal API gateways in addition to microservice proxies.