NashTech

Replacing legacy messaging layer systems with NashTech cloud solutions

Leading online delivery service

Introduction

NashTech helped replace an Amazon SNS and SQS messaging layer with an event-driven architecture based on the NashTech Cloud solution and Apache Kafka.

Online delivery services for groceries and essentials depend on the ability to quickly and reliably complete seamless transactions. Each transaction starts with a customer selecting the items they want and placing an order, then proceeds through a shopper claiming that order, visiting the selected retailer, purchasing the items, and delivering them to the customer.

The challenge

At one of the fastest growing delivery services in the U.S., these transactions are powered by an event-driven architecture based on Apache Kafka® and NashTech Cloud solution. In addition to handling near-real-time event streams for customer orders, this architecture supports a product data pipeline that handles multiple terabytes of data and up to 20 million messages per day. 

The event streaming infrastructure offers many crucial advantages over the legacy messaging layer it replaced at the company, which was based on Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS). NashTech Cloud solution enabled the company to get started quickly, minimise operational overhead, and reduce engineering effort while providing a cloud vendor-agnostic solution.

Having replaced SNS and SQS messaging with Kafka and NashTech Cloud solutions, the company is seizing the opportunity to optimise business processes and extend the benefits of event streaming to new initiatives. “We are already launching all new services we develop on NashTech Cloud solution,” says the engineer. “In addition, one of the big engineering wins we’ve seen is that we can now convert our in-line product data pipeline into a parallel architecture, which will reduce bottlenecks and get us closer to real-time operations.”

The solution

After using SNS and SQS for several years, the company moved to Kafka and NashTech Cloud solutions to address several pain points with the Amazon services. A key factor was a company-wide push toward cloud-agnostic solutions driven by the CTO. Other drawbacks with SNS and SQS included message size limits and the lack of message retention or message compression, which required the company to expend significant engineering effort in developing in-house workarounds.

Replacing the legacy messaging layer with Kafka removed these limitations instantly. The engineering team now has the ability to configure the message size limit on a per topic basis and enable compression for any producer to reduce payload size and bandwidth consumption. 
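
As a rough illustration of the two settings mentioned above, the sketch below uses the standard Kafka configuration keys `max.message.bytes` (topic level) and `compression.type` (producer level). The topic name, the 5 MB limit, and the sample payload are hypothetical and not taken from the company's setup, and gzip from the Python standard library stands in for the compression a Kafka producer would apply.

```python
import gzip
import json

# Hypothetical per-topic override; `max.message.bytes` is Kafka's topic-level size limit.
topic_config = {
    "name": "product-data",  # assumed topic name, for illustration only
    "config": {"max.message.bytes": str(5 * 1024 * 1024)},
}

# Producer-side setting; Kafka accepts values such as gzip, snappy, lz4 and zstd.
producer_config = {"compression.type": "gzip"}


def sample_payload(n_items: int) -> bytes:
    """Build a repetitive JSON payload, loosely resembling a product feed."""
    items = [
        {"sku": f"SKU-{i:06d}", "retailer": "example-store", "in_stock": True}
        for i in range(n_items)
    ]
    return json.dumps(items).encode("utf-8")


def compress(payload: bytes, config: dict) -> bytes:
    """Apply gzip when the producer config asks for it, otherwise pass through."""
    if config.get("compression.type") == "gzip":
        return gzip.compress(payload)
    return payload


def fits_topic(payload: bytes, config: dict) -> bool:
    """Check a payload against the topic's configured size limit."""
    return len(payload) <= int(config["config"]["max.message.bytes"])


raw = sample_payload(10_000)
compressed = compress(raw, producer_config)
print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes")
print("fits topic limit:", fits_topic(compressed, topic_config))
```

The point of the sketch is that both behaviours become configuration values rather than application code.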

The team completed a few test projects with the NashTech Cloud solution before choosing their first production project: replacing an existing service that consumes product data from retailers, enriches it, and then makes it available to other services. “On a typical day, this service processes up to 20 million messages, and we knew that if Kafka worked well for this use case, it should work well for other use cases we had in mind,” says the engineer. “It was also a safe place for experimentation because we could rebuild data stores from the original sources if we made any errors.” 

Throughout the initial setup and development efforts, the team met regularly with NashTech engineers to discuss best practices and operational details. “We had a helpful monthly sync with NashTech, where we discussed mirroring topics, monitoring clusters, or any other issues we were interested in,” says the engineer.

Following their success on the initial project, the team has since added new services to their event-driven architecture and has plans for a larger-scale refactoring of the pipeline. “Message retention with NashTech Cloud solution has been a nice win because when we stand up a new service, it can start consuming from an existing topic, and the data is already there,” says the engineer. “Refactoring the product pipeline by making use of log compaction and moving toward a parallel asynchronous model will be an even bigger win for us.” 
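
Log compaction, mentioned in the quote above, can be pictured with a small in-memory model: for each key, only the most recent record is kept. This is a simplified sketch, not Kafka's implementation. Real compaction runs in the background on older log segments and keeps delete markers (tombstones) for a configurable period, and the keys and values below are invented for illustration.

```python
from typing import Iterable, Optional

# (key, value); a value of None marks a delete marker (tombstone)
Record = tuple[str, Optional[str]]


def compact(log: Iterable[Record]) -> list[Record]:
    """Keep the latest record per key, dropping keys whose latest value is a tombstone."""
    latest: dict[str, tuple[int, Optional[str]]] = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later offsets overwrite earlier ones
    # Preserve the original log order of the surviving records.
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in survivors if value is not None]


# Hypothetical product updates keyed by SKU.
log = [
    ("sku-1", "price=1.99"),
    ("sku-2", "price=4.50"),
    ("sku-1", "price=2.49"),
    ("sku-3", "price=0.99"),
    ("sku-2", None),
]

compacted = compact(log)
print(compacted)
```

A new service reading the compacted topic sees one current record per product instead of the full update history.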

The team is also working on standard libraries and tools to further speed the adoption of event streaming across the company. This includes an infrastructure-as-code initiative that enables teams to create topics via Terraform with declarative configuration files.

“With SNS we either had to write custom code for message compression or pay for the extra bandwidth that larger payloads required. With NashTech Cloud solution, we got that out of the box along with a more flexible and robust solution for interservice communication.”

Senior Software Engineer

The outcome 

Initial setup time reduced from months to about an hour. “With NashTech Cloud solution, I had our Kafka cluster set up with simple event publishing in about an hour,” says the engineer. “If we had to stand everything up on our own, it would have taken us four to six months, in part because of the learning curve.”

Engineering efforts reduced. “With the SNS message size limit we faced in the past, we had to write code to split larger payloads, and that hurt efficiency,” says the engineer. “Similarly, lack of message compression required us to either write more code or pay for more bandwidth. With NashTech Cloud solution, we don’t have to worry about any of that because message sizes can be configured and compression enabled with simple configuration changes.”
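
To give a sense of the kind of workaround the quote describes, here is a minimal sketch of splitting an oversized payload into numbered chunks and reassembling it on the consumer side. It is not the company's code. The function names, the chunk envelope format, and the 256-byte limit used in the example are all assumptions chosen for illustration.

```python
from typing import Iterable


def split_payload(payload: bytes, max_bytes: int) -> list[dict]:
    """Split a payload into chunks no larger than max_bytes, tagged for reassembly."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    # An empty payload still produces a single empty chunk.
    bodies = [payload[i:i + max_bytes] for i in range(0, len(payload), max_bytes)] or [b""]
    return [
        {"index": i, "total": len(bodies), "body": body}
        for i, body in enumerate(bodies)
    ]


def reassemble(parts: Iterable[dict]) -> bytes:
    """Rebuild the original payload; chunks may arrive in any order."""
    ordered = sorted(parts, key=lambda part: part["index"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing chunks")
    return b"".join(part["body"] for part in ordered)


payload = b"x" * 1000
parts = split_payload(payload, max_bytes=256)  # hypothetical limit
print(len(parts), "chunks")
print("round trip ok:", reassemble(reversed(parts)) == payload)
```

Even this small version needs ordering and completeness checks, which is the extra code a configurable message size limit makes unnecessary.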

Operational management minimised. “We don’t have anyone on staff dedicated 100% to managing our Kafka infrastructure, and that’s because NashTech host it,” says the engineer. “NashTech Cloud solution handles our high-volume traffic with reliability and availability—we’ve had no problems that impacted business since launch.”

New capabilities enabled. “Message retention in Kafka and NashTech Cloud solution enables us to process all the data in our core pipeline in parallel for enriching product information,” says the engineer. “Plus, if we find a bug in one of our consumers, we can fix it and replay earlier messages to resolve any issues introduced. And when we add new consumers to a topic, we have days of retained data to seed them with. None of that was possible before.”
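
The replay and seeding behaviour described above follows from consumers tracking their own position in a retained log. The sketch below models that idea in memory. The `TopicLog` and `Consumer` classes are hypothetical stand-ins, not a Kafka client API, and the sample records and handlers are invented.

```python
from typing import Callable


class TopicLog:
    """An append-only list of records, standing in for a retained topic partition."""

    def __init__(self) -> None:
        self._records: list[str] = []

    def append(self, value: str) -> int:
        self._records.append(value)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset: int) -> list[tuple[int, str]]:
        return list(enumerate(self._records))[offset:]


class Consumer:
    """Processes records from its current position and remembers where it stopped."""

    def __init__(self, log: TopicLog, handler: Callable[[str], str]) -> None:
        self.log = log
        self.handler = handler
        self.position = 0
        self.results: dict[int, str] = {}

    def poll(self) -> None:
        for offset, value in self.log.read_from(self.position):
            self.results[offset] = self.handler(value)
            self.position = offset + 1

    def seek(self, offset: int) -> None:
        self.position = offset  # rewind (or skip ahead) to a chosen offset


log = TopicLog()
for item in [" apples", "Pears "]:
    log.append(item)

# A buggy handler that forgets to trim whitespace.
consumer = Consumer(log, handler=lambda v: v.upper())
consumer.poll()

# Fix the bug, rewind to the start, and reprocess the retained records.
consumer.handler = lambda v: v.strip().upper()
consumer.seek(0)
consumer.poll()
print(consumer.results)

# A newly added consumer starts at offset 0 and is seeded with the same history.
late_joiner = Consumer(log, handler=str.strip)
late_joiner.poll()
print(late_joiner.results)
```

With a queue that deletes messages once they are delivered, neither the rewind nor the late joiner would have anything to read.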

Throughput doubled. “Our legacy messaging layer was handling in the neighborhood of 20,000 messages per second,” says the engineer. “When we switched to NashTech Cloud solution, we more than doubled that rate without making any significant changes.”

“We started with pay as you go and had things up and running in about an hour. NashTech Cloud solution enabled us to defer the learning curve on running Kafka in-house. We currently don’t have the resources or bandwidth to host and manage clusters ourselves, so instead, we have the experts on Kafka doing that for us.”

Senior Software Engineer
