Apply parallel processing algorithms
Parallel processing algorithms give you several ways to split a batch job into batch tasks. Developers typically use three types of parallel processing in their designs:
Individual task modeling - Creates a separate batch task for each work item. This approach works well with a small number of work items and, of the three approaches, makes it easiest to create dependencies between work items. It is also the simplest to develop. It is not ideal for a large number of work items because the overhead of each batch task affects performance and causes delays between running batch tasks.
Batch bundling - Uses a limited number of batch tasks, each of which processes a bundle of work items. The bundle size is the number of work items in each batch task. This approach works well for a simple, even workload where processing times are similar, and it keeps the batch table from being overpopulated with batch tasks. An appropriate bundle size improves performance, whereas a bundle size that is too large or too small can cause performance issues. Uneven workloads can decrease performance with this approach because batch tasks finish at different times, which leaves some of them idle while others are still processing.
Top picking - Creates a limited number of batch tasks, each of which repeatedly picks and processes the first free work item. This approach requires a staging table that stores the work items for the batch tasks to pick from. It works well if the workload is uneven, with varying processing times. As with batch bundling, it does not overpopulate the batch table with batch tasks. If many small work items need to be processed, the cost of tracking them in the staging table can affect performance.
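The three approaches differ mainly in how work items are mapped to batch tasks. The following is a minimal Python sketch of that mapping, for illustration only. The function names, the `(name, duration)` work item format, and the use of a heap to simulate which batch task becomes free first are all assumptions of this sketch, not part of any batch framework.

```python
import heapq

def individual_tasks(work_items):
    """Individual task modeling: one batch task per work item."""
    return [[item] for item in work_items]

def bundled_tasks(work_items, bundle_size):
    """Batch bundling: each batch task receives up to bundle_size work items."""
    return [work_items[i:i + bundle_size]
            for i in range(0, len(work_items), bundle_size)]

def top_picking(work_items, task_count):
    """Top picking: a fixed number of batch tasks each take the first free item.

    work_items is a list of (name, duration) pairs in staging order.
    Returns a dict that maps each task id to the names it processed.
    """
    # Heap of (time at which the task becomes free, task id).
    free_at = [(0, task_id) for task_id in range(task_count)]
    heapq.heapify(free_at)
    picked = {task_id: [] for task_id in range(task_count)}
    for name, duration in work_items:           # staging order
        now, task_id = heapq.heappop(free_at)   # task that is free earliest
        picked[task_id].append(name)
        heapq.heappush(free_at, (now + duration, task_id))
    return picked

# Hypothetical workload: one slow work item followed by three fast ones.
items = [("A", 5), ("B", 1), ("C", 1), ("D", 1)]
print(len(individual_tasks(items)))   # 4 batch tasks
print(len(bundled_tasks(items, 2)))   # 2 batch tasks
print(top_picking(items, 2))          # {0: ['A'], 1: ['B', 'C', 'D']}
```

In this simulated run, the task that picks the slow item `A` stays busy while the other task works through the three fast items, which is the behavior that makes top picking suitable for uneven workloads.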
Some key considerations for selecting an approach are:
- Workload characteristics
  - Select individual task modeling for small, interdependent workloads.
  - Opt for batch bundling for uniform tasks with predictable processing times.
  - Use top picking for uneven workloads requiring flexibility.
- Performance optimization
  - Minimize overhead by avoiding unnecessary batch task creation.
  - Experiment with bundle sizes in batch bundling to find an optimal balance.
- Implementation complexity
  - Start with individual task modeling for simple requirements.
  - Transition to batch bundling or top picking as workload complexity increases.
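One way to reason about bundle size before testing on a real system is a small simulation. The Python sketch below estimates when the last bundle finishes for several bundle sizes. It is only a rough model under stated assumptions: `finish_time`, the fixed per-task `overhead`, the number of `workers` (batch tasks that can run at the same time), and the sample durations are all hypothetical, and real results depend on the actual environment.

```python
import heapq

def finish_time(durations, bundle_size, workers, overhead):
    """Estimate when the last bundle completes.

    Each bundle is one batch task. A task costs a fixed overhead plus the
    sum of its work item durations, and runs on whichever worker is free
    first.
    """
    bundles = [durations[i:i + bundle_size]
               for i in range(0, len(durations), bundle_size)]
    free_at = [0.0] * workers        # time at which each worker becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for bundle in bundles:
        start = heapq.heappop(free_at)          # earliest free worker
        end = start + overhead + sum(bundle)
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish

# Hypothetical even workload: 12 work items of 1 time unit each.
durations = [1.0] * 12
for size in (1, 2, 4, 6, 12):
    print(size, finish_time(durations, size, workers=3, overhead=0.5))
# 1 6.0
# 2 5.0
# 4 4.5
# 6 6.5
# 12 12.5
```

In this model, a bundle size of 1 pays the task overhead twelve times, while a bundle size of 12 leaves two of the three workers idle. The middle value of 4 gives the shortest finish time, which illustrates why a bundle size that is too small or too large can hurt performance.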
Example Scenario
Consider a company that processes sales orders daily. Small orders (under USD 1,000) take a few seconds to validate, while large orders (above USD 10,000) require manual approval, which takes several minutes.
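Because processing times vary this widely, top picking is a good fit. Store the orders in a staging table and let each batch task pick the next free order, taking order size and priority into account, as soon as it finishes its current one. A batch task that is waiting on a large order then does not hold up the small orders.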
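The picking step can be sketched in Python as follows. This is an illustration under assumptions, not a batch framework implementation: the in-memory `staging` list stands in for a staging table, the lock stands in for whatever row locking the database provides, and the order numbers, amounts, and the `pick_next`, `process`, and `batch_task` names are hypothetical. Rows are picked in staging order here. Ordering them by size or priority would be an additional step.

```python
import threading

# Hypothetical staging table: one row per sales order.
staging = [
    {"order": f"SO-{n:03d}", "amount": amount,
     "status": "waiting", "picked_by": None}
    for n, amount in enumerate([250, 12000, 800, 40, 15000, 990], start=1)
]
staging_lock = threading.Lock()

def pick_next(task_name):
    """Atomically claim the first free row, or return None if none remain."""
    with staging_lock:
        for row in staging:
            if row["status"] == "waiting":
                row["status"] = "in progress"
                row["picked_by"] = task_name
                return row
    return None

def process(row):
    """Placeholder for validating or approving the order."""
    row["status"] = "done"

def batch_task(task_name):
    """Keep picking work items until the staging table has no free rows."""
    while (row := pick_next(task_name)) is not None:
        process(row)

# A limited number of batch tasks share the same staging table.
tasks = [threading.Thread(target=batch_task, args=(f"task-{i}",))
         for i in range(3)]
for task in tasks:
    task.start()
for task in tasks:
    task.join()

print(all(row["status"] == "done" for row in staging))  # True
```

Because each row is claimed while the lock is held, no order is processed twice, and a task that is busy with a slow order does not block the others from picking the remaining rows.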