From Monitoring to Automation: Building Stable and Scalable Microservices

Aleksei Kish, lead developer at Semrush and an expert in managing microservice architectures for high-load applications, shared his insights on automating CI/CD processes and monitoring performance metrics. Aleksei detailed strategies to prevent API incompatibility issues, optimize request tracing, choose the right storage for various load profiles, and ensure infrastructure reproducibility.

Aleksei, could you tell us about the projects where you’ve implemented microservice architecture and the results you achieved?

My most recent project utilizing a microservice architecture is EyeOn, a competitor monitoring tool by Semrush. The solution enables users to create projects with lists of competitor domains and track their activity, including social media and blog posts, website changes, new pages, and advertising campaigns.

I opted for a microservice architecture based on the product’s requirements: data collection and processing, web interface display, and notifications via email and Slack. The architecture comprised about 30 services, including an API Gateway, project services, archives, notifications, and others.

The key advantage of this approach is the clear division of responsibility among services and well-designed interactions between them. This structure provides several benefits:

  • Elimination of development blockers. Each service can be deployed independently without requiring coordination with other teams.
  • Scalability. For example, the archive service, used for both the UI and notifications, can scale independently without impacting other parts of the system.
  • Resilience. The product remains operational even if some services, such as data collectors or notification modules, go offline.

As a result, the project demonstrates high performance, stability, and readiness for user base growth, all while minimizing costs and providing convenience for both developers and users.

How Do You Approach Designing Microservice Architecture, and When Is It Preferable to a Monolith?

I usually begin designing microservice architecture by analyzing business entities and their interactions. If operations on entities can be split into independent tasks with varying execution times, microservices are the better choice. For instance, when working with reports, a monolith might suffice initially. However, as the system grows, splitting functionality into report generators, storage, and viewers becomes necessary. This separation simplifies adding new report types and scaling operations.

A monolithic solution is suitable for a quick start since it’s simpler to develop, deploy, and manage. However, as the system grows and new use cases emerge, a monolith starts to hinder the speed of updates and complicates workflows. Microservice architecture allows for a clear separation of responsibilities, making development more flexible and scalable.

Microservices are particularly useful for large projects with multiple teams. Dividing functionality into services enables each team to work autonomously, coordinating only through well-defined contracts. This reduces interdependencies, minimizes development conflicts, and accelerates the release of new features.

When choosing between a monolith and microservices, I consider the current requirements, the growth rate of the system, the team’s scale, and long-term development plans.

What Strategies Help You Ensure Microservices’ Independence Within a System?

The primary strategy is the clear separation of responsibilities. Each service is responsible for its task and interacts with others through predefined contracts. These contracts define the input and output parameters, while the service’s internal workings remain a “black box” to the rest of the system. For example, a service might receive a report generation task via a message queue, execute it, and send the result to the report storage without relying on other systems.
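A minimal sketch of that report-generation contract, using Python's in-memory `queue` and a dict as stand-ins for a real message broker and report store (the message shape and names here are invented for illustration):

```python
import json
import queue

# Hypothetical stand-ins for a message queue and a report store;
# in production these would be e.g. RabbitMQ/Kafka and a database.
tasks = queue.Queue()
report_store = {}

def handle_task(message: str) -> None:
    """Consume one report-generation task and publish the result.

    The contract is only the message shape: {"report_id": ..., "query": ...}.
    How the report is actually built stays internal to this service.
    """
    task = json.loads(message)
    result = f"report for {task['query']}"  # placeholder for real generation
    report_store[task["report_id"]] = result

tasks.put(json.dumps({"report_id": "r1", "query": "weekly sales"}))
while not tasks.empty():
    handle_task(tasks.get())
```

The consuming side never imports anything from the producer; both depend only on the agreed message format, which is what keeps the service a "black box."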

To facilitate communication between microservices, I use standard mechanisms such as HTTP requests (REST, gRPC) or asynchronous message queues. This minimizes dependencies, speeds up development, and simplifies scaling.

Designing microservices resembles the process of developing standards in other fields. First, a theoretical model is created (the contract description), and then teams implement it while adhering to the agreed-upon rules. If all participants follow the standards, the system becomes robust and adaptable for future enhancements.

Thus, the independence of microservices is achieved through the strict isolation of functionality, well-defined contracts, and standardized interaction methods.

How Do You Maintain Data Consistency Between Microservices?

Again, I always emphasize the clear separation of responsibilities and strive to minimize cases where services need to share a common context. If such a need arises, it’s often a signal that the architecture should be reconsidered. Each service should be stateless, meaning it doesn’t retain information about previous requests. Instead, it requests the necessary data from another service or a data store, avoiding the storage of shared context.

In rare cases where data from different sources must be combined, the process is designed to eliminate direct dependencies between services. For example, two services may collect different types of data and store them in a shared data store. A third service then combines the data and provides a unified result. However, even in such scenarios, data is centralized in a single source to avoid overlapping responsibilities.

At the infrastructure level, it’s crucial to consider the load on data stores, as scaling microservices can lead to database performance issues. If the system can handle high service-level traffic but the databases cannot, solutions like sharding or optimization become necessary. For instance, data can be distributed across multiple instances based on predefined rules, such as using the remainder of dividing an entity’s ID by the number of shards to determine its placement.
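The modulo-based placement rule described above can be sketched in a few lines; the shard hostnames here are hypothetical:

```python
def shard_for(entity_id: int, num_shards: int) -> int:
    """Pick a shard by the remainder of dividing the ID by the shard count."""
    return entity_id % num_shards

# Hypothetical connection targets for four database instances.
SHARDS = [f"db-{i}.internal" for i in range(4)]

def placement(entity_id: int) -> str:
    """Map an entity ID to the database instance that holds its data."""
    return SHARDS[shard_for(entity_id, len(SHARDS))]

print(placement(42))  # db-2.internal
```

The same function runs in every service instance, so any of them can compute where a given record lives without coordination. The trade-off is that changing the shard count requires rebalancing existing data.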

What Challenges Do You Face When Automating CI/CD for Microservice Architecture, and How Do You Solve Them?

The main challenge in this area is managing releases that break API backward compatibility. For instance, when deploying a new API, it’s essential to ensure that dependent services do not fail.

This problem is addressed with a phased approach. First, the new version of the service is deployed alongside the old one. Then, dependent services are updated to support the new API. After verifying that everything works correctly, traffic is gradually shifted to the new version, and the old version is decommissioned.

If the update requires data migration, the process becomes more complex, but the general principle remains: duplicate services and switch traffic incrementally.

Modern tools like Kubernetes significantly simplify version management and scaling. However, the core task remains preventing disruptions in the interaction between microservices, which means managing API versions carefully. For major changes, a new version of the API should be introduced (e.g., adding a prefix like V3 or V4). This ensures that dependent services have access to both versions of the API until the migration is complete.
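As a toy illustration of serving two API versions side by side during such a migration (the route names and payload shapes are invented for this sketch):

```python
def get_report_v3(report_id: str) -> dict:
    # Old contract: flat payload.
    return {"id": report_id, "status": "done"}

def get_report_v4(report_id: str) -> dict:
    # New, breaking contract: status moved into a nested "meta" object.
    return {"id": report_id, "meta": {"status": "done"}}

# Both versions stay routable until every consumer has migrated to /v4,
# after which the /v3 entry can be decommissioned.
ROUTES = {
    "/v3/reports": get_report_v3,
    "/v4/reports": get_report_v4,
}

def dispatch(path: str, report_id: str) -> dict:
    return ROUTES[path](report_id)
```

Existing consumers keep calling `/v3/reports` unchanged while updated ones move to `/v4/reports`, which is what makes the incremental traffic shift safe.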

Which Metrics Are Most Useful for Monitoring Microservices and Their Performance?

In my opinion, the most useful observability signals are request tracing, logging, and the standard RED metrics (Request rate, Error rate, and Duration).

Request tracing allows you to track the entire lifecycle of a request across the system, from its source to execution. This helps identify which service initiated the request, how it propagated through other services, and where issues occurred. For example, tracing can show that a user request via the gateway triggered a chain of calls to various services, pinpointing the exact location of a failure.

Logging complements tracing by providing detailed information about actions performed by services and aiding in the analysis of errors or abnormal behavior. Logs are particularly useful if they contain meaningful data, such as request details or execution context.

RED metrics provide a foundational view of each service’s performance:

  • Request rate: The frequency of requests handled.
  • Error rate: The proportion of failed requests.
  • Duration (latency): The time taken to process requests.

These approaches help maintain transparency in microservices operations while enabling rapid identification and resolution of issues. Modern tools like Prometheus and Grafana make it straightforward to integrate metrics collection and visualization into the system, greatly simplifying management.
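The three RED metrics can be computed directly from per-request records; the sample data and the observation window below are made up for this sketch:

```python
from statistics import mean

# Sample request records a service might emit: (status_code, duration_ms).
requests = [(200, 35), (200, 41), (500, 120), (200, 38), (503, 95)]
window_seconds = 60  # hypothetical observation window

request_rate = len(requests) / window_seconds                     # req/s
error_rate = sum(1 for s, _ in requests if s >= 500) / len(requests)
avg_duration = mean(d for _, d in requests)                       # mean ms

print(request_rate, error_rate, avg_duration)
```

In practice these values come pre-aggregated from a metrics system rather than being computed by hand, but the definitions are exactly these three ratios.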

How Do You Organize Efficient Data Storage in High-Load Microservice Architectures?

Each service should have its own dedicated data store, tailored to its specific load profile. For instance, a report-generation service may get by with a modest database because its operations involve infrequent reads and writes. Meanwhile, an archival service faces heavy workloads due to high volumes of writes, reads, and additional tasks like data exports or analytics.

For such high-load scenarios, choosing the right solutions is critical. Data can be split into two layers: metadata about reports can be stored in a relational database, while large files can be kept in a specialized document store. This approach reduces the load on the primary database and simplifies scaling.
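A minimal sketch of that two-layer split, with SQLite standing in for the relational database and a dict standing in for the blob/document store (table and key names are invented):

```python
import sqlite3

# Relational layer: small metadata rows only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE reports (id TEXT PRIMARY KEY, title TEXT, blob_key TEXT)")

# Stand-in for a document/object store (e.g. S3, GCS) holding the bulky payloads.
blob_store = {}

def save_report(report_id: str, title: str, body: bytes) -> None:
    key = f"reports/{report_id}.bin"
    blob_store[key] = body                              # large payload
    db.execute("INSERT INTO reports VALUES (?, ?, ?)",  # small metadata row
               (report_id, title, key))

def load_report(report_id: str) -> bytes:
    (key,) = db.execute(
        "SELECT blob_key FROM reports WHERE id = ?", (report_id,)
    ).fetchone()
    return blob_store[key]

save_report("r1", "Q3 summary", b"large report body")
```

Queries and listings touch only the lightweight metadata table; the heavy payloads are fetched by key only when actually needed, which is what keeps the primary database small and easy to scale.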

The key factor is analyzing the load profile of each service and anticipating peak data volumes to proactively adapt the infrastructure to future challenges.

How Does Infrastructure as Code (IaC) Help You Manage and Scale Microservices?

Infrastructure as Code (IaC) significantly simplifies managing and scaling microservices by enabling automation and reproducibility. It allows infrastructure to be rapidly adapted to new requirements—for example, adding database replicas or increasing server capacity—by simply modifying variables in configuration files such as those used with Terraform.

One of IaC’s major advantages is transparency: the entire infrastructure is described in code, making it easier for new team members to understand and onboard. Even if the team undergoes a complete overhaul, new members can quickly familiarize themselves with the repository to understand which resources are in use and how they’re configured.

Additionally, IaC standardizes processes and minimizes errors from manual configuration. Instead of creating resources manually through a provider interface—which can be inconsistent—the system automatically deploys infrastructure according to predefined configurations. This applies to a wide range of resources, including databases, secret managers, virtual machines, and more.

Scaling and modifying resources are equally straightforward. For example, increasing provisioned capacity or disk space requires only a configuration update, which can then be applied with a single command. The system implements the changes automatically. IaC not only facilitates the deployment of new resources but also supports updating existing ones while managing access and versioning via tools like Google Secret Manager.

This approach makes infrastructure management transparent, repeatable, and easily scalable—essential qualities for handling complex microservice architectures.

About Author
Christy Alex
Christy Alex is a Content Strategist at Alltech Magazine. He grew up watching football, MMA, and basketball and has always tried to stay up-to-date on the latest sports trends. He hopes one day to start a sports tech magazine.