Why Data Replication Should Not Be Done Using ESB-Based Integration Tools

August 13, 2019 Rajeev Gupta

This is one of the common questions we get when prospects come looking for data replication tools. It’s more a question of integration design patterns than of product implementations.

Let’s start with what an ESB (Enterprise Service Bus) is. It is an integration design pattern in which messages are passed through a bus so that one or more message listeners can listen for and consume them: store and forward. These messages, much like emails, have a header (from and to), a payload (the message), and perhaps attachments. Depending on the ESB, there may be limits on payload and attachment sizes.
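To make the message structure concrete, here is a minimal sketch in Python. The class name, field names, and the size limit are assumptions made for illustration; they are not taken from any particular ESB product.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """A hypothetical ESB-style message envelope (names are illustrative)."""
    sender: str                      # header: "from"
    recipient: str                   # header: "to"
    payload: bytes                   # the message body
    attachments: list[bytes] = field(default_factory=list)

    def size(self) -> int:
        """Total bytes carried by the payload plus all attachments."""
        return len(self.payload) + sum(len(a) for a in self.attachments)


# Assumed limit for illustration only; each bus documents its own limits.
MAX_MESSAGE_BYTES = 1_000_000


def accept(message: Message, limit: int = MAX_MESSAGE_BYTES) -> bool:
    """Return True if the message fits within the bus's size limit."""
    return message.size() <= limit
```

A small record update passes this check easily, while a bulk extract of a large table sent as a single message would not, which is one reason large data sets have to be split up before they can travel over a bus.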

The flow looks like this:

App produces message -> ESB receives message (in a queue) -> ESB routes message based on routing rules -> Listener consumes message -> Likely maps/translates data -> Saves/forwards to another app/queue -> Confirms message is received -> ESB tags and stores message as processed.

Notice the ones in “bold”. These are places where data flow can “choke” or “build up” if there is a high flow of large data sets.
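The flow above can be sketched with Python's standard `queue` module. This is a toy model under stated assumptions, not how any real ESB is implemented: the function names and field names are hypothetical, routing is skipped, and the producer is assumed to send a burst of messages before the listener drains anything.

```python
import queue


def translate(message: dict) -> dict:
    """Listener-side mapping from source fields to target fields (illustrative)."""
    return {"account_id": message["Id"], "account_name": message["Name"]}


def run_bus(burst: list[dict], queue_size: int):
    """Push a burst through queue -> consume -> translate -> deliver -> mark processed."""
    inbox: queue.Queue = queue.Queue(maxsize=queue_size)  # the bus's queue (store)
    overflow = []
    for message in burst:
        try:
            inbox.put_nowait(message)      # producer hands the message to the bus
        except queue.Full:
            overflow.append(message)       # build-up: the queue cannot take more

    delivered, processed = [], []
    while not inbox.empty():
        message = inbox.get_nowait()       # listener consumes the message
        delivered.append(translate(message))  # map/translate, then forward
        processed.append(message["Id"])    # bus tags the message as processed
    return delivered, processed, overflow
```

With a queue of size 2 and a burst of 5 messages, only 2 are delivered and 3 end up in `overflow`. Every message also passes through a store, a translation, and a bookkeeping step, which is the per-message overhead that adds up with large data sets.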

Now look at data replication. You have a source of data, be it a database (the common case) or a cloud application (like Salesforce). Data replication requires a complete copy of both the data and its changes, including schema changes. The replication tool is expected to identify schema changes and apply them to the target without remapping, so the ability to interpret source schema changes and adjust the target schema becomes important. It must also be able to process a large number of rows. Most databases replicate using their transaction logs (this is what you find under the hood of master–slave replication). When you have disparate applications like Salesforce and Oracle, you have to rely on query -> extract -> interpret change -> check the target for duplicates -> load into the other system.
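The query -> extract -> check for duplicates -> load sequence can be sketched with two SQLite databases standing in for the source and target. This is a simplified illustration, not how DBSync or any other product works internally: the `account` table, its `modified_at` column, and the function name are assumptions, and change detection here relies on a last-modified watermark rather than transaction logs.

```python
import sqlite3

# Assumed table layout for the sketch; modified_at is an integer timestamp.
ACCOUNT_DDL = (
    "CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, modified_at INTEGER)"
)


def replicate_changes(source: sqlite3.Connection,
                      target: sqlite3.Connection,
                      last_sync: int) -> int:
    """Copy rows modified after last_sync and return the new watermark."""
    # Query and extract only the rows that changed since the last run.
    rows = source.execute(
        "SELECT id, name, modified_at FROM account "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_sync,),
    ).fetchall()
    # The primary key on id handles the duplicate check: a row that already
    # exists in the target is replaced instead of inserted a second time.
    target.executemany(
        "INSERT OR REPLACE INTO account (id, name, modified_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return rows[-1][2] if rows else last_sync
```

Each run moves the whole set of changed rows in one pass and carries the returned watermark into the next run, with no per-row queueing or acknowledgement step in between.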

Ok, so let’s now look at why ESB-based apps might not be the right choice:

An ESB requires store and forward, which might not be necessary for data replication. You can argue that it will work (and yes, you can make it work), but it will be slow and overly complicated.
ESBs are generally considered to have higher operational management overhead and to require higher uptime, since they are mostly used for distributed app integration. Replication usually runs in batch (or on a schedule) or, in the case of master–slave, much closer to real time than ESBs are designed for.
Managing schema changes often requires remapping some of the ESB message flows. Some of our clients really dislike this: not only do they have to track source and target schemas, but each change often triggers a “Change Management” request up the IT chain, which can take weeks or months to clear. Data replication software usually adjusts target schemas automatically.
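As a rough illustration of what automatic target schema adjustment can mean, the sketch below compares the columns of a source and target table in SQLite and adds any that are missing from the target. The function names are hypothetical, and the sketch covers added columns only; renamed or dropped columns and type changes need more care in a real tool.

```python
import sqlite3


def column_types(db: sqlite3.Connection, table: str) -> dict:
    """Return {column_name: declared_type} for a table."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in db.execute(f"PRAGMA table_info({table})")}


def sync_schema(source: sqlite3.Connection,
                target: sqlite3.Connection,
                table: str) -> list:
    """Add to the target any columns that exist only in the source."""
    # Table and column names are interpolated directly, so this assumes they
    # come from a trusted catalog and not from user input.
    source_columns = column_types(source, table)
    target_columns = column_types(target, table)
    added = []
    for name, declared_type in source_columns.items():
        if name not in target_columns:
            target.execute(f'ALTER TABLE {table} ADD COLUMN "{name}" {declared_type}')
            added.append(name)
    target.commit()
    return added
```

Running a check like this before each load lets a new source column flow through to the target without anyone editing a field mapping by hand.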

When you look at the integration tools market, the industry has segmented itself into two groups: one going the ESB or message queue route (which is slowly evolving into API-based integration) and the other focusing on data replication.

So let us look at some of the common integration apps and how they fit in:

MuleSoft – A leader in ESB-based integration. It does quite well in service-oriented architecture and in integrating apps like SAP. They are also introducing API management.

Kafka – An open-source messaging platform, very popular for high-volume messaging, especially with IoT and big data. It requires smaller message sizes.

GoldenGate (by Oracle) – A leader in data replication between different databases. It does not yet offer cloud application data replication.

DBSync – Cloud Data Replication uses a direct replication technique, while the iPaaS Cloud Workflow is more of a store-and-forward design.

There are many more; a good place to look is Gartner’s Data Integration and Gartner’s Integration-as-a-Service Magic Quadrants to see which vendors are leading the pack.

We would be delighted to discuss your use case and explore how DBSync can support your success. Please feel free to schedule a meeting with us.

FAQs

What are the key differences between ESB-based integration and dedicated data replication tools?

ESB-based tools focus on message passing and service orchestration, while dedicated data replication tools specialize in efficient, high-volume data synchronization with schema management.

Why are ESB tools generally unsuitable for large-scale data replication?

ESBs introduce overhead with store-and-forward mechanisms, which can slow down data transfer and become overly complex for large data sets, impacting performance.

How do schema changes affect data replication using ESB tools compared to dedicated replication tools?

ESBs often require manual remapping for schema changes, leading to delays and IT change management processes, whereas dedicated tools typically automate schema adjustments.

What are the performance and operational management differences between ESB and data replication tools?

Data replication tools are optimized for batch or real-time data movement with lower overhead, while ESBs are designed for distributed application integration with higher uptime requirements.

What are examples of integration tools and their specific use cases in data replication and integration?

Tools like GoldenGate specialize in database replication, while DBSync offers both direct replication and iPaaS cloud workflow solutions; ESBs like MuleSoft are better suited for service-oriented architecture and API management.

Rajeev Gupta

Rajeev has extensive experience with application architecture and on-demand computing. He is also a serial entrepreneur with over 27 years in the tech industry and has worked with Fortune 100 companies like GE, HCA and McGraw Hill Digital Learning.
