Highly Available Jaeger Tracing Collectors in Oracle Cloud
--
When you set out to build cloud native applications, or to decompose a monolithic application into microservices, one of the issues you will encounter sooner or later is getting good insight into what the system is actually doing. The benefit of a monolithic application is that logging and monitoring can be done in one place, or at least in a limited number of places.
A distributed solution, with multiple services running in a multitude of containers that are spread over multiple cloud regions, creates the problem that a single user interaction triggers actions in multiple environments. When you invoke an action on a monolithic application, all logging ends up on that specific server. When you do the same in a distributed system, the logging that shows the entire chain of events is scattered over all kinds of different systems (containers / serverless functions / virtual machines).
Jaeger provides part of the solution to this issue. Jaeger is an open source, end-to-end distributed tracing solution used to monitor and troubleshoot transactions in complex distributed systems. As on-the-ground microservice practitioners are quickly realizing, the majority of operational problems that arise when moving to a distributed architecture are ultimately grounded in two areas: networking and observability. Networking and debugging a set of intertwined distributed services is simply a problem that is orders of magnitude larger than doing the same for a single monolithic application.
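To make the idea behind distributed tracing a bit more concrete, here is a minimal sketch in plain Python. It does not use Jaeger or any tracing library; the service names, the `LogRecord` type and the helper functions are all hypothetical and exist only for illustration. It shows the core idea: when every step of one user interaction carries the same trace identifier, records that end up on different systems can be stitched back together into a single chain of events.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LogRecord:
    service: str    # hypothetical service or container that wrote the record
    trace_id: str   # identifier shared by every step of one user interaction
    message: str


def handle_request(trace_id: str) -> List[LogRecord]:
    """Simulate one user interaction passing through three made-up services."""
    return [
        LogRecord("frontend", trace_id, "received order request"),
        LogRecord("inventory", trace_id, "reserved stock"),
        LogRecord("billing", trace_id, "charged customer"),
    ]


def group_by_trace(records: List[LogRecord]) -> Dict[str, List[LogRecord]]:
    """Rebuild the chain of events per interaction, wherever the records came from."""
    chains = defaultdict(list)
    for record in records:
        chains[record.trace_id].append(record)
    return dict(chains)


# Two independent user interactions, each with its own trace identifier.
first, second = uuid.uuid4().hex, uuid.uuid4().hex

# Interleave the records, as if they were collected from different systems.
collected = [
    record
    for pair in zip(handle_request(first), handle_request(second))
    for record in pair
]

chains = group_by_trace(collected)
for record in chains[first]:
    print(record.service, "-", record.message)
```

A real tracing system records far more than this (timing, parent and child relationships between spans, tags), but the shared identifier is what allows the scattered pieces to be correlated at all.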
Jaeger in production
As with every production deployment, one of the key requirements will be fault tolerance and high availability. The standard architecture for Jaeger, as shown below, provides all the key options needed to build a highly available and fault tolerant solution.
When we follow the high availability guidelines, no component should ever be a single point of failure. This means that a large part of the standard deployment best practices we have been applying in on-premises deployments will also hold in a cloud / cloud-native deployment. Even though cloud infrastructure and cloud native services…