Real-time vs Batch Integration
When to use real-time integration and when batch processing is the better choice. A practical guide to choosing the right integration approach.
Real-time: Data moves between systems immediately after an event occurs, or within seconds of it. When a customer places an order on your website, your warehouse system knows about it within moments.
True real-time typically uses webhooks or event streaming - the source system pushes data as events happen rather than waiting to be asked.
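As a rough illustration of the push model, the sketch below uses a tiny in-process event bus as a stand-in for a webhook endpoint or event stream. The `EventBus` class, the `order.placed` event name, and the payload fields are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class EventBus:
    """Tiny in-process stand-in for a webhook endpoint or event stream."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict[str, Any]) -> None:
        # The source pushes the event to every subscriber as soon as it happens.
        for handler in self._handlers.get(event_type, []):
            handler(payload)


# Hypothetical warehouse system: it simply records what it is told.
warehouse_inbox: list[dict[str, Any]] = []

bus = EventBus()
bus.subscribe("order.placed", warehouse_inbox.append)

# The website publishes the moment the order is placed; nobody has to ask.
bus.publish("order.placed", {"order_id": 1001, "sku": "WIDGET-1", "qty": 2})
```

The key property is that the warehouse never polls: it registers interest once and is called when something happens. A production setup would add delivery retries and authentication, which this sketch leaves out.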
Near real-time: Data syncs frequently - every few minutes - but not instantaneously. It is often implemented by polling (checking for changes on a schedule) rather than by push-based events.
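Polling can be sketched as a function that asks "what changed since my last check?" and remembers a cursor between runs. The `updated_at` field, the record layout, and the `poll_changes` name are assumptions made for this example; a scheduler would call the function every few minutes.

```python
from typing import Any


def poll_changes(
    records: list[dict[str, Any]], cursor: int
) -> tuple[list[dict[str, Any]], int]:
    """Return records updated after `cursor`, plus the cursor for the next poll."""
    changed = [r for r in records if r["updated_at"] > cursor]
    next_cursor = max((r["updated_at"] for r in changed), default=cursor)
    return changed, next_cursor


# Hypothetical source table; updated_at is a Unix timestamp in seconds.
source = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 250},
    {"id": 3, "updated_at": 400},
]

# First poll picks up everything changed after the stored cursor.
first_batch, cursor = poll_changes(source, cursor=200)

# A second poll with the advanced cursor finds nothing new.
second_batch, cursor = poll_changes(source, cursor=cursor)
```

Data can be as stale as one polling interval, which is the trade-off for not needing the source system to push anything.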
Batch: Data accumulates and moves in scheduled batches - hourly, nightly, or weekly. All changes since the last batch are processed together.
Real-time is the better fit when acting on stale data causes immediate harm:
E-commerce inventory: When stock is low, real-time sync prevents overselling. A customer shouldn't be able to buy the last unit while another customer's order is in a batch queue.
Fraud detection: Checking transactions against fraud rules must happen before the transaction completes - batching is too late.
Customer support context: When a customer calls, the agent needs to see their recent orders immediately, not from last night's batch.
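To make the inventory case concrete, here is a minimal sketch of a stock reservation that checks and decrements in one step, so two orders cannot both claim the last unit. The `Inventory` class and the SKU are hypothetical, and the in-memory lock stands in for what would normally be a database transaction or an atomic update.

```python
import threading


class Inventory:
    """In-memory stock counter; a real system would use a database transaction."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def reserve(self, sku: str, qty: int) -> bool:
        # Check and decrement under one lock so two orders cannot both
        # claim the last unit.
        with self._lock:
            available = self._stock.get(sku, 0)
            if qty <= 0 or available < qty:
                return False
            self._stock[sku] = available - qty
            return True

    def available(self, sku: str) -> int:
        with self._lock:
            return self._stock.get(sku, 0)


inventory = Inventory({"WIDGET-1": 1})
first_order = inventory.reserve("WIDGET-1", 1)   # gets the last unit
second_order = inventory.reserve("WIDGET-1", 1)  # rejected instead of oversold
```

With a batch queue in between, both orders would be accepted and the conflict would only surface when the batch ran.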
Batch is the better fit when processing needs a complete, settled dataset:
Payroll processing: Salary calculations need complete data. Processing in real-time would mean incomplete calculations as timesheet entries arrive throughout the period.
Data warehouse loading: Analytics queries run on yesterday's complete data, not constantly changing current data.
Month-end reporting: Financial close processes need all transactions finalised before processing - real-time would create moving targets.
| Factor | Real-time | Batch |
|---|---|---|
| Data freshness | Seconds/minutes | Hours/days |
| Complexity | Higher - event handling, error recovery | Lower - straightforward ETL |
| Infrastructure cost | Higher - always-on, scalable | Lower - runs periodically |
| Error handling | Must handle immediately | Can review before retry |
| System coupling | Tighter (with sync calls) | Looser (files/staging) |
| Testing difficulty | Higher - timing/sequencing issues | Lower - deterministic |
Most real-world integrations use a mix. You might process orders in real-time but sync product catalog changes in nightly batches. Some patterns that combine approaches:
Event-triggered batch: A real-time event starts a batch process. For example, an "end of day" event triggers nightly processing, or a "file received" event starts a batch import.
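A "file received" event starting a batch import might look like the sketch below. The event shape, the function names, and the CSV columns are assumptions for illustration; the file content is passed inline so the example runs without touching disk.

```python
import csv
import io
from typing import Any


def import_batch(file_text: str) -> list[dict[str, str]]:
    """Batch step: parse every row of the received file in one pass."""
    return list(csv.DictReader(io.StringIO(file_text)))


def on_file_received(event: dict[str, Any], store: list[dict[str, str]]) -> int:
    """Event step: a 'file received' event kicks off the batch import."""
    rows = import_batch(event["content"])
    store.extend(rows)
    return len(rows)


# Hypothetical destination table and incoming event.
product_table: list[dict[str, str]] = []
event = {
    "type": "file.received",
    "content": "sku,price\nWIDGET-1,9.99\nWIDGET-2,14.50\n",
}

imported = on_file_received(event, product_table)
```

The trigger is immediate, but the work itself is still a plain batch: the whole file is processed together and can be re-run if it fails.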
Micro-batching: Very frequent small batches - every 5 minutes, for example - provide near real-time freshness with batch-style processing. This is simpler than full event streaming and fresher than traditional batches.
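One way to sketch small, frequent batches is a buffer that flushes when it is either full or old enough. The `MicroBatcher` class, its parameter names, and the thresholds are assumptions for this example; the clock is injectable so the five-minute window can be simulated without waiting.

```python
import time
from typing import Any, Callable, Optional


class MicroBatcher:
    """Buffers items and flushes them together as one small batch."""

    def __init__(
        self,
        flush: Callable[[list[Any]], None],
        max_items: int = 100,
        max_age_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_items = max_items
        self._max_age = max_age_seconds
        self._clock = clock
        self._buffer: list[Any] = []
        self._opened_at: Optional[float] = None

    def add(self, item: Any) -> None:
        if not self._buffer:
            # The age of a batch is measured from its first item.
            self._opened_at = self._clock()
        self._buffer.append(item)
        self._flush_if_due()

    def tick(self) -> None:
        """Call periodically so a quiet buffer still flushes on time."""
        self._flush_if_due()

    def _flush_if_due(self) -> None:
        if not self._buffer or self._opened_at is None:
            return
        too_big = len(self._buffer) >= self._max_items
        too_old = self._clock() - self._opened_at >= self._max_age
        if too_big or too_old:
            batch, self._buffer = self._buffer, []
            self._flush(batch)


flushed: list[list[Any]] = []
fake_now = [0.0]  # simulated clock, in seconds
batcher = MicroBatcher(
    flushed.append, max_items=3, max_age_seconds=300.0, clock=lambda: fake_now[0]
)

batcher.add("a")
batcher.add("b")
fake_now[0] = 300.0
batcher.tick()              # five minutes elapsed: flushes ["a", "b"]
for item in ("c", "d", "e"):
    batcher.add(item)       # third add reaches max_items: flushes ["c", "d", "e"]
```

The downstream code only ever sees lists of items, so it can stay as simple as a traditional batch job while the data is at most a few minutes old.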
Tiered approach: Prioritise real-time for the 20% of data that matters most. Order status updates might be real-time while customer preference changes sync overnight.
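Splitting data by priority can be as simple as a routing function in front of both paths. In this sketch the change types, the `REALTIME_TYPES` set, and the two sinks are hypothetical; which data belongs in the real-time tier is a business decision rather than something code can determine.

```python
from typing import Any, Callable

# Hypothetical set of change types that justify the real-time path.
REALTIME_TYPES = {"order_status"}


def route_change(
    change: dict[str, Any],
    send_now: Callable[[dict[str, Any]], None],
    overnight_queue: list[dict[str, Any]],
) -> str:
    """Send high-priority changes immediately; queue the rest for the nightly batch."""
    if change["type"] in REALTIME_TYPES:
        send_now(change)
        return "real-time"
    overnight_queue.append(change)
    return "batch"


sent_now: list[dict[str, Any]] = []
overnight_queue: list[dict[str, Any]] = []

status_route = route_change(
    {"type": "order_status", "order_id": 1001, "status": "shipped"},
    sent_now.append,
    overnight_queue,
)
preference_route = route_change(
    {"type": "customer_preference", "customer_id": 42, "newsletter": False},
    sent_now.append,
    overnight_queue,
)
```

Keeping the rule in one place makes it cheap to promote a data type to real-time later, or to demote one that turned out not to need it.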
Principle: Start with batch unless there's a clear business reason for real-time. It's easier to add real-time capabilities later than to simplify an overly complex real-time architecture.
Real-time integration sounds impressive, but batch processing often provides the best balance of simplicity, cost, and reliability. The right choice depends on your specific requirements - how fresh the data needs to be, what your systems can support, and how much complexity your team can manage.
Don't default to real-time because it seems modern. Default to the simplest approach that meets your actual requirements.