Scheduling in the Time of Event Driven Architecture
Scheduling and batch processing are as relevant today as they were decades ago. Use cases such as payroll processing and invoicing still rely on batch processing. With the recent noise around event-driven architecture and streaming, a disturbing trend is the use of events to handle scheduled batches. Most architectures today say little or nothing about schedulers or batch processes, and some even treat them as an anti-pattern. As a result, solution designs take an event-driven approach instead of a scheduled or time-based one.
Consider a use case where invoices are to be generated daily. Input files contain details of orders, customers and rates. The files are to be fetched and sent to another system in sequence: order files first, followed by customer data and then rate information. I was asked to review a solution defined for this use case. The solution, apparently based on a real-time, event-driven architecture, had the following steps:
- Define an event-driven trigger that fires when a file is placed in the directory
- The trigger starts a program that starts reading the files
- It checks that the files are in sequence, i.e., that the order file is read first, followed by the customer file and the rate file
- The first file is then sent to the destination followed by other files
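The event-driven steps above can be sketched roughly as follows. This is a minimal illustration, not the reviewed solution: the class `FileArrivalHandler`, the file kinds in `REQUIRED_SEQUENCE` and the `send` callback are all hypothetical names, and the sketch assumes the handler buffers files until all three have arrived. The point is that the handler must carry sequencing logic itself, because arrival order is not guaranteed.

```python
from typing import Callable, Dict

# Hypothetical file kinds, in the sequence the target system expects.
REQUIRED_SEQUENCE = ("orders", "customers", "rates")


class FileArrivalHandler:
    """Buffers arriving files and forwards them only in the required sequence."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self.send = send                  # assumed callback that delivers one file
        self.pending: Dict[str, str] = {}  # file kind -> path seen so far

    def on_file_created(self, kind: str, path: str) -> bool:
        """Called once per file-arrival event; returns True when a batch is sent."""
        if kind not in REQUIRED_SEQUENCE:
            raise ValueError(f"unexpected file kind: {kind}")
        self.pending[kind] = path
        # The sequence check: nothing is sent until every file is present.
        if any(k not in self.pending for k in REQUIRED_SEQUENCE):
            return False
        for k in REQUIRED_SEQUENCE:
            self.send(self.pending[k])
        self.pending.clear()
        return True
```

Even in this small form, the handler has to hold state between events and decide when the set of files is complete, which is work the trigger itself does not do.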
Contrast this with a time/schedule based approach:
- Define a time-driven trigger that fires at a specified time
- The trigger starts a program that starts reading the files
- Checking the sequence is not mandatory here, as the files are read at a specified time rather than as and when they are placed in the directory. The possibility of fetching or processing files out of sequence is reduced
- The files are sent to the destination with less processing, as order checking is not required
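For comparison, a time-driven version might look like the sketch below. The file names in `FILES_IN_SEQUENCE`, the helper `seconds_until` and the `send` callback are assumptions made for illustration. The job runs at a fixed hour, by which time all files are expected to be in place, so it simply reads them in a fixed sequence and fails loudly if one is missing.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Callable, List

# Hypothetical file names, listed in the sequence they must be sent.
FILES_IN_SEQUENCE = ("orders.csv", "customers.csv", "rates.csv")


def seconds_until(run_at_hour: int, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of run_at_hour:00."""
    target = now.replace(hour=run_at_hour, minute=0, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # today's slot has passed; use tomorrow's
    return (target - now).total_seconds()


def run_batch(directory: str, send: Callable[[Path], None]) -> List[str]:
    """Send the files in their fixed sequence; no arrival-order logic is needed."""
    paths = [Path(directory) / name for name in FILES_IN_SEQUENCE]
    missing = [p.name for p in paths if not p.exists()]
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")
    for path in paths:
        send(path)
    return [p.name for p in paths]
```

A caller could sleep for `seconds_until(2, datetime.now())` and then call `run_batch`, although in practice a cron entry or an enterprise scheduler would normally own the timing.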
For a more robust sequencing process, a scheduler such as Control-M could be used instead of a simple time trigger, as it can check the status of each file load in the target system and send the next file only when the previous load has succeeded.
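The idea of gating each step on the success of the previous one can be illustrated generically. The sketch below does not use any Control-M API; `send_in_sequence`, `send` and `load_succeeded` are hypothetical names standing in for whatever the scheduler and target system actually provide.

```python
from typing import Callable, List, Sequence


def send_in_sequence(
    files: Sequence[str],
    send: Callable[[str], None],
    load_succeeded: Callable[[str], bool],
) -> List[str]:
    """Send files one by one, stopping at the first load that does not succeed.

    Returns the files whose loads were confirmed as successful.
    """
    completed: List[str] = []
    for name in files:
        send(name)
        # Assumed status check against the target system after each load.
        if not load_succeeded(name):
            break  # later files are held back until the failure is resolved
        completed.append(name)
    return completed
```

With this shape, a failed customer load means the rate file is never sent, which is the behaviour a dependency-aware scheduler gives without custom code.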
Organizations will continue to have a mix of batch, event-driven and synchronous real-time processes. The advent of digital technologies does not mean that everything will become on-demand. The architecture building blocks need to include tools and techniques that support all of these requirements, so that the resulting solutions are not brittle.
Cross posted at https://www.linkedin.com/in/daneshzaki/