Tuesday, July 10, 2012

Designing a stoppable long running process

Some processes are running at least 30 minutes. The question is how to design the long running process to be stoppable and upon restart it continues from where it left in the previous process? This is an interesting question. Unfortunately, most of the long running processes that I have heard are not stoppable and requires manual patching the 'corrupted' output before re-run the process.

For example, the process requires to issue monthly bill/invoice on the web hosting. This web hosting company has more than 10,000 customers and each of the customer are subscribing different services.

Reason for stopping the long run process:
  • The hard disk is running out of space and the process has stopped.
  • There are some new subscriptions to be queued into the bill/invoice.
  • Server has to be restarted due to OS (Operating System) updates or running out of memory.
  • For whatever unforeseen reason.
First thing first, we need a table to store the records to be process. You might have to customize this table so that it can cater for your need. The batch_no field is a text field which you can design it to store multi-locations or multi-companies billing process.

create table tb_bill_process_log (
    bill_process_log uniqueidentifier not null primary key,
    sess_id uniqueidentifier not null,  -- this is the session ID of the process.
    batch_no nvarchar(50) not null,  -- this is the batch number.
    customer_id uniqueidentifier not null,  -- this is the customer ID.
    process_start_on datetime,  -- the process start time.
    process_end_on datetime,  -- the process completion time.
    remarks nvarchar(255),  -- the remarks set by the long running process.
    created_on datetime  -- the record creation date/time.
)

create table tb_bill_process_batch (
    bill_process_batch_id uniqueidentifier not null primary key,
    batch_no nvarchar(50) not null,  -- the batch number.
    process_restart_counter int,  -- increment this value everytime the process restarted.
    start_on datetime,  -- the first process start date/time.
    complete_on datetime, -- the process completion date/time.
    created_on datetime  -- the record creation date/time
)

The flow of the long running process:
  1. Upon the long running process starts, you should assign a batch number (stores in batch_no field). If the batch number has been created, increment the tb_bill_process_batch. process_restart_counter value by one.
  2. Then, copy all the customer who is still subscribing the services or the subscription has been terminated in the last month. All these customers will receive the bill/invoice. If possible, do this within a database transaction. If there are too many customers to be bill, you might have to split the customers by group or location. This reduces the time to lock all the necessary resources.
  3. Now, the billing process starts (this is the process of calculating the charges). You need a cursor where it is selecting the customer ID base on the batch_no + process_start_on is null.
  4. After retrieved the first customer ID value from the cursor, set the process_start_on to current date/time and insert the necessary billing records. Upon completion, update the process_end_on to current date/time. In this step, you need a database transaction to ensure that all records are committed at the same scope.
  5. Then, continue with the next customer ID until all records have been processed.
  6. Finally, update tb_bill_process_batch.complete_on field to indicate that the long running process has completed.
That's all you need to design a stoppable long running process. If you think this is not enough and would like to reduce the process time, this is what you can do:

Develop a program in whatever programming language that supports multi-threading and execute the charges calculation stored procedure in multiple threads. To do this, you need to split the long running process stored procedures into a few sub-processes:
  • The first stored procedure is responsible to generate a new batch number and collect all the customer ID that requires to be bill (step 1 and 2). This stored procedure will be run by 1 thread only.
  • The second stored procedure is responsible for calculating the charges for each customer (step 3 to 6). This stored procedure will be run in multiple threads. In this stored procedure, you can't loop the cursor anymore. The cursor must be replace with a query that always returns the TOP 1 customer ID by matching the batch_no + process_start_on is null.
Here you go. If you are running 2 threads for the charges calculation stored procedure, the entire process time will be reduce by <50%. Please take note that it won't be exactly 50% of reduction because of the process requires to lock the necessary resources.

No comments:

Post a Comment