Below is some of our guideline in developing stored procedures:
The stored procedure must be test-able & traceable - the PRINT keyword must be use and/or sufficient log records must be created for system administrator to trace the process.
The stored procedure must be able to re-run - we should not expect the stored procedure will be run once. It should be tested with the idea what-if the caller run it for multiple times. What will happen? Another way of thinking is that, if we run the same stored procedure multiple times with the same parameters, does it give the same output?
The stored procedure must be broke down into small piece of process and meet the above guideline - sometimes, it's unbelievable to see a stored procedure which contains 1000+ lines. This is not a good way in developing stored procedure. The best way is to modularize the processes and break it down into small processes. This will ease the testing and also future enhancement.
Stop hard-coding the constants in a stored procedure - some programmer wants to complete their job as soon as possible without really care the future maintenance. The best way to handle constant is to save the value in a table and load it when you need it.
Stop using "flag" parameter to instruct the stored procedure to run different process - you should not use any parameter as a flag to indicate running which process within the stored procedure. For example, flag=1 is to returns customer records and flag=2 is to update customer's outstanding balance. This is very confusing.
Naming convention - don't name the stored procedure the way you like. All stored procedure must follow the agreed naming convention. In fact, the stored procedure name should contains the main module code (as prefix) to ease the future maintenance.
Tuesday, June 26, 2012
Wednesday, June 20, 2012
Unique value for Primary Key
I used to defined the primary key in integer database until one day that I'm working on web project. I realized that the using integer as the data type for primary key might have some issues.
Issue #1 - passing the primary key in the URL. For example, the primary key value for customer A is "1001". If the user wants to edit the customer A's profile, you might use the query string to contain the primary key value so that the page can query the database and retrieve the appropriate record. This is very bad practice because the user will be able to change the query string and retrieve the customer profile that they want.
Issue #2 - bottleneck of inserting new record. For example, you are running system that requires handling many transactions per seconds. Whether you are using "identity" or generating the next ID value that stores in last ID table, it creates bottleneck while inserting new records. The bottleneck will become even worst if you have long process to be run in the triggers or stored procedure.
Issue #3 - multiple databases. In certain situation, you might have to run the same system in multiple locations and each location has it's own database. These databases requires to be merge into a huge centralized database at later time. Irregardless of the merging interval, by using integer data type in the primary key, you are facing a problem with a question: what is the appropriate range of values to be assign to each location?
After trial and error with integer primary key and nvarchar (i.e., add a "outlet code" field to make the composite key), we found an easier way to solve the above problem. Instead, we are using UNIQUEIDENTIFIER data type (or GUID in .Net framework).
When showing the UNIQUEIDENTIFIER value in the URL query string, it's hard for the user to guess the primary key value for other customer. The bottleneck problem also disappear because UNIQUEIDENTIFIER value can be generated at anytime by calling NEWID (MSSQL) function. Finally, merging the records from various database will not be a problem and you don't have to scratch your head to look for the appropriate range of values to be allocated.
Of course, UNIQUEIDENTIFIER data type comes with some costs. The first one is that it requires more storage space and the next cost is the searching takes more time. But, my argument is that the hard disk price is cheap and will be cheaper in the future. The searching time can be reduce with proper indexing or increase the hardware specification. Another cost is the data fragmentation which slows down the query. This happens when we are using NEWID() function to generate a new random value causing the record to be inserted into somewhere in the table. To solve the fragmentation, you will to rebuild or reorganise the index (you might have to google for the related commands).
In our database design, the primary key for master and child tables are UNIQUEIDENTIFIER data type and it is single field primary key. For the child table, the foreign key is not part of the primary key. The advantage is that it allows the DBA to retrieve any records easily because each record has it's own unique value.
Lastly, in MSSQL, the primary key is a clustered index. This means, inserting new records will cause the MSSQL engine to look for the appropriate page position. Some people will say this is fast and some deny it. Some DBA even say that because of the records were not stored in sorted order, it is more beneficial to the parallelism when reading records from hard disk.
All other potential factors must be consider carefully before implementing this idea.
Technical specification:
Database size that has implemented this idea: < 10GB.
MSSQL Server: 2005/2008.
Environment: Win2003, Win2008, Win XP/7 (even in the virtual PC).
Issue #1 - passing the primary key in the URL. For example, the primary key value for customer A is "1001". If the user wants to edit the customer A's profile, you might use the query string to contain the primary key value so that the page can query the database and retrieve the appropriate record. This is very bad practice because the user will be able to change the query string and retrieve the customer profile that they want.
Issue #2 - bottleneck of inserting new record. For example, you are running system that requires handling many transactions per seconds. Whether you are using "identity" or generating the next ID value that stores in last ID table, it creates bottleneck while inserting new records. The bottleneck will become even worst if you have long process to be run in the triggers or stored procedure.
Issue #3 - multiple databases. In certain situation, you might have to run the same system in multiple locations and each location has it's own database. These databases requires to be merge into a huge centralized database at later time. Irregardless of the merging interval, by using integer data type in the primary key, you are facing a problem with a question: what is the appropriate range of values to be assign to each location?
After trial and error with integer primary key and nvarchar (i.e., add a "outlet code" field to make the composite key), we found an easier way to solve the above problem. Instead, we are using UNIQUEIDENTIFIER data type (or GUID in .Net framework).
When showing the UNIQUEIDENTIFIER value in the URL query string, it's hard for the user to guess the primary key value for other customer. The bottleneck problem also disappear because UNIQUEIDENTIFIER value can be generated at anytime by calling NEWID (MSSQL) function. Finally, merging the records from various database will not be a problem and you don't have to scratch your head to look for the appropriate range of values to be allocated.
Of course, UNIQUEIDENTIFIER data type comes with some costs. The first one is that it requires more storage space and the next cost is the searching takes more time. But, my argument is that the hard disk price is cheap and will be cheaper in the future. The searching time can be reduce with proper indexing or increase the hardware specification. Another cost is the data fragmentation which slows down the query. This happens when we are using NEWID() function to generate a new random value causing the record to be inserted into somewhere in the table. To solve the fragmentation, you will to rebuild or reorganise the index (you might have to google for the related commands).
In our database design, the primary key for master and child tables are UNIQUEIDENTIFIER data type and it is single field primary key. For the child table, the foreign key is not part of the primary key. The advantage is that it allows the DBA to retrieve any records easily because each record has it's own unique value.
Lastly, in MSSQL, the primary key is a clustered index. This means, inserting new records will cause the MSSQL engine to look for the appropriate page position. Some people will say this is fast and some deny it. Some DBA even say that because of the records were not stored in sorted order, it is more beneficial to the parallelism when reading records from hard disk.
All other potential factors must be consider carefully before implementing this idea.
Technical specification:
Database size that has implemented this idea: < 10GB.
MSSQL Server: 2005/2008.
Environment: Win2003, Win2008, Win XP/7 (even in the virtual PC).
Tuesday, June 5, 2012
Raise an error that the C# can catch
Often, our program execute stored procedure to do some processes. In the event of the data is missing or violate the business rules, the stored procedure should stop the process and return a useful message to the caller.
In order for the C#/ASP.net to be able to catch the exception, you have to pass in the appropriate values into raiserror() function.
For example:
To capture the current error, please use the following code:
declare @err_msg nvarchar(max),
@err_severity int,
@err_state int
set @err_msg = error_message()
set @err_severity = error_severity()
set @err_state = error_state()
raiserror(@err_msg,@err_severity,@err_state)
In order for the C#/ASP.net to be able to catch the exception, you have to pass in the appropriate values into raiserror() function.
For example:
raiserror('This error will be able to catch by C#/asp.net', 11, 1)
Just beware that the second parameter should start from 11 or otherwise C#/ASP.net will not be able to catch it.
To capture the current error, please use the following code:
declare @err_msg nvarchar(max),
@err_severity int,
@err_state int
set @err_msg = error_message()
set @err_severity = error_severity()
set @err_state = error_state()
raiserror(@err_msg,@err_severity,@err_state)
Subscribe to:
Posts (Atom)