Thursday, September 29, 2005

Reading XML Doc in SQL Server 2000

XML Doc to Process



Products Table Structure:

ProductID
CategoryID
Description

Reading XML doc inside sp


CREATE PROCEDURE spAddCatalogItems (
@xmlDocument varchar (8000))
AS
BEGIN
DECLARE @docHandle int
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xmlDocument
INSERT INTO Products
SELECT * FROM OPENXML (@docHandle, '/ROOT/Category/Product', 1)
WITH (ProductID int './@ProductID',
CategoryID int '../@CategoryID',
[Description] varchar (100) './@Description')
EXEC sp_xml_removedocument @docHandle
END
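
To illustrate, here is a hypothetical call; the XML values below are made up, but the shape follows the OPENXML mapping above (CategoryID on the Category element, ProductID and Description on each Product element).

-- Hypothetical usage; the document shape mirrors the OPENXML paths above.
DECLARE @doc varchar(8000)
SET @doc = '
<ROOT>
  <Category CategoryID="1">
    <Product ProductID="11" Description="Widget"/>
    <Product ProductID="12" Description="Gadget"/>
  </Category>
  <Category CategoryID="2">
    <Product ProductID="21" Description="Sprocket"/>
  </Category>
</ROOT>'
EXEC spAddCatalogItems @xmlDocument = @doc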

Objects Used

sp_xml_preparedocument
OPENXML
sp_xml_removedocument

This code is very handy when you are getting more than one record to process from the UI.
It can be used as a standard way to get the XML from the DAL and process it at the SQL end (after adding exception handling and a return/output variable).
If you want to go further, you can even return XML as an output parameter too. Sounds cool :)

- Vee

SYSTEM_USER

Allows a system-supplied value for the current system username to be inserted into a table when no default value is specified.

DECLARE @sys_usr char(30)
SET @sys_usr = SYSTEM_USER
SELECT 'The current system user is: '+ @sys_usr
GO
Here is the result set:
----------------------------------------------------------
The current system user is: sa

(1 row(s) affected)

Friday, July 29, 2005

All about Locking in SQL 2000

full article coming soon...

Four aspects of locking

  • Mode of locking
  • Granularity of the lock
  • Duration of the lock
  • Ownership of the lock

Lock Modes

  • Shared (S), Update (U), and Exclusive (X) locks
  • Intent locks (IS, IX, SIX)
  • Schema locks (Sch-S for stability, Sch-M for modification)
  • Bulk Update (BU) locks
  • Key-range locks (used to protect ranges under SERIALIZABLE)

Lock Granularity
  • Locks can be taken at the table, page, or row level. SQL Server also locks index keys and ranges of index keys

Lock Duration
  • How long a lock is held depends on the mode of the lock and the transaction isolation level

Lock Ownership

  • Transactions
  • Cursors
  • Sessions


Other related terms


A few useful system tables for locks
  • syslockinfo table

A few useful system stored procedures for locks
  • sp_getapplock
  • sp_releaseapplock
  • sp_lock
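
As a quick sketch of how the application-lock procedures fit together (the resource name is made up for illustration):

-- Take an application lock inside a transaction, do the protected work,
-- then release it (sp_getapplock / sp_releaseapplock, SQL Server 2000).
BEGIN TRAN
DECLARE @result int
EXEC @result = sp_getapplock @Resource = 'MyCriticalSection',
                             @LockMode = 'Exclusive',
                             @LockTimeout = 5000
IF @result >= 0
BEGIN
    -- ... protected work goes here ...
    EXEC sp_releaseapplock @Resource = 'MyCriticalSection'
END
COMMIT TRAN

-- Inspect the locks currently held on the server
EXEC sp_lock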

Thursday, July 28, 2005

SQL: Scale Up vs. Scale Out

Scaling up refers to moving an application to a larger class of hardware that uses more powerful processors, more memory, and quicker disk drives. Scaling out refers to an implementation of federated servers, where consumer-class computers are added and where data is then partitioned or replicated across them. You can scale out by using functional partitioning. For example, you might scale out by putting your Customer Relationship Management (CRM) functionality on one server and your Enterprise Resource Planning (ERP) functionality on another server. Or, you could scale out by using data partitioning. For example, you might scale out by creating updatable partitioned views across databases. Do not consider scaling up or scaling out until you are certain that you are getting the best performance that you can through application optimization. Consider the following scenarios when it comes to addressing two common scalability bottlenecks:

* Processor and memory-related bottlenecks: Scaling up is usually a good approach if your bottlenecks are processor related or memory related. By upgrading to a faster processor or by adding more processors, you maximize use of your existing hardware resources. You can resolve memory bottlenecks by adding additional memory or by upgrading existing memory. The /3GB switch in the Boot.ini file and Address Windowing Extensions (AWE) also help maximize memory use (see the configuration sketch after this list).

* Disk I/O-related bottlenecks: Scaling up can also help to resolve disk I/O-related bottlenecks. This form of bottleneck usually occurs in online transaction processing (OLTP) applications where an application performs random disk reads and writes, in contrast to sequential access in online analytical processing (OLAP) applications. For OLTP applications, the I/O load can be spread by adding disk drives. Adding memory also helps reduce I/O load. Because the I/O load is reduced, the size of the SQL Server buffer cache increases. As a result, page faults are reduced.
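
As a rough configuration sketch for the memory side: AWE on SQL Server 2000 Enterprise Edition is switched on through sp_configure (the /3GB switch itself lives in Boot.ini, outside SQL Server), and the change takes effect after a service restart.

-- Enable AWE so SQL Server 2000 Enterprise Edition can use memory beyond
-- the normal virtual address space; the service account needs the
-- "Lock Pages in Memory" privilege, and SQL Server must be restarted.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'awe enabled', 1
RECONFIGURE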

Consider the following guidelines before you decide to scale up or scale out:


* Optimize the application before scaling up or scaling out.
* Address historical and reporting data.
* Scale up for most applications.
* Scale out when scaling up does not suffice or is cost-prohibitive.


Optimize the Application Before Scaling Up or Scaling Out


Before you decide to scale up or to scale out, you need to be sure that it is required.
Scaling out works best when you plan and design for it in the early stages of your
application development life cycle. Changing your application after it is in production so that you can scale up or scale out is expensive. In addition, certain initial design decisions that you make may prevent you from scaling out later.

You can resolve most performance and scalability issues by performing the optimization steps that are outlined in the rest of this chapter. These optimizations help reduce the impact of bottlenecks that are caused by specific design or implementation techniques. These optimizations also help ensure that existing resources are fully utilized. For example, with optimization, you can resolve bottlenecks that are caused by inefficient locking, unprepared SQL statements, poor indexes that lead to increased CPU utilization, and memory or disk I/O utilization.

In practice, you need to simulate your data usage and growth early in the application life cycle by using a realistic workload. Simulating your data usage and growth helps you identify scalability issues sooner rather than later so that you can modify your design and approach to mitigate those issues.

Address Historical and Reporting Data


Historical data may become very large over time and may cause long-running queries. Consider partitioning historical data by some range, and implement a way to limit older data. Either move the older data offline, or implement a separate data warehouse that contains the older data.


Reporting needs may also be very resource intensive. You may consider upgrading your database server or scaling out to meet your reporting needs. By implementing a data warehouse or a reporting server, you may be able to provide faster response times and less resource contention. Additionally, a data warehouse or a reporting server is easier to manage than multiple servers in a federated server scheme.


Scale Up for Most Applications

If you still have high levels of system resource use after you tune your application and after you address historical and reporting data issues, consider replacing slow hardware components with new, faster components. Or, consider adding more hardware to your existing server.

High levels of system resource use include high CPU utilization, high memory use, and excessive disk I/O. The new components you might add include additional processors or memory. Alternatively, consider replacing your existing server with a new, more powerful server.


Ensure that any configuration changes take full advantage of the new hardware. For example, you may need to use the /3GB switch in the Boot.ini file. This is an easy next step for both migration and maintenance reasons. You should perform tests to help determine the new server capacity that you require.


Scale Out When Scaling Up Does Not Suffice or Is Cost-Prohibitive

If your application still does not perform well enough, you can consider scaling out or implementing a federated servers option. These approaches usually require certain tables to be horizontally partitioned so that they reside on separate servers. The approaches may also require some replication between servers of the main domain tables that also have to be available on a partition.
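
As a sketch of what the data-partitioning piece can look like, here is a minimal distributed partitioned view; the server names, table names, and ranges are invented for illustration:

-- Each member server holds one horizontal slice of the Customers data,
-- enforced by a CHECK constraint on the partitioning column, for example:
--   Server1.Sales.dbo.Customers_1: CustomerID BETWEEN 1 AND 32999
--   Server2.Sales.dbo.Customers_2: CustomerID BETWEEN 33000 AND 65999
-- The same view is defined on every member server and unions the slices:
CREATE VIEW dbo.Customers AS
    SELECT * FROM Server1.Sales.dbo.Customers_1
    UNION ALL
    SELECT * FROM Server2.Sales.dbo.Customers_2
-- With linked servers and the CHECK constraints in place, the view is
-- updatable and the optimizer routes each query to the relevant member only.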

Disaster recovery and failover are also more complex for federated servers. You have to determine if the benefit of this added complexity outweighs the cost advantage of being able to use consumer-class computers for federated servers.

About Log Shipping

What is Log Shipping

Essentially, log shipping is the process of automating the backup of database and transaction log files on a production SQL server, and then restoring them onto a standby server. But this is not all. The key feature of log shipping is that it will automatically back up transaction logs throughout the day (at whatever interval you specify) and automatically restore them on the standby server. This in effect keeps the two SQL Servers in "synch". Should the production server fail, all you have to do is point the users to the new server, and you are all set. Well, it's not really that easy, but it comes close if you put enough effort into your log shipping setup.

Benefits of Log Shipping

While I have already talked about some of the benefits of log shipping, let's take a more comprehensive look:
1. Log shipping doesn't require expensive hardware or software. While it is great if your standby server is similar in capacity to your production server, it is not a requirement. In addition, you can use the standby server for other tasks, helping to justify the cost of the standby server. Just keep in mind that if you do need to fail over, that this server will have to handle not one, but two loads. I like to make my standby server a development server. This way, I keep my developers off the production server, but don't put too much work load on the standby server.
2. Once log shipping has been implemented, it is relatively easy to maintain.
3. Assuming you have implemented log shipping correctly, it is very reliable.
4. The manual failover process is generally very short, typically 15 minutes or less.
5. Depending on how you have designed your log shipping process, very little, if any, data is lost should you have to failover. The amount of data loss, if any, is also dependent on why your production server failed.

Problems with Log Shipping

Let's face it, log shipping is a compromise. It is not the ideal solution, but it is often a practical solution given real-world budget constraints. Some of the problems with log shipping include:

1. Log shipping failover is not automatic. The DBA must still manually failover the server, which means the DBA must be present when the failover occurs.
2. The users will experience some downtime. How long depends on how well you implemented log shipping, the nature of the production server failure, your network, the standby server, and the application or applications to be failed over.
3. Some data can be lost, although not always. How much data is lost depends on how often you schedule log shipping and whether or not the transaction log on the failed production server is recoverable.
4. The database or databases that are being failed over to the standby server cannot be used for anything else. But databases on the standby server not being used for failover can still be used normally.
5. When it comes time for the actual failover, you must do one of two things to make your applications work: either rename the standby server the same name as the failed production server (and the IP address), or re-point your user's applications to the new standby server. In some cases, neither of these options is practical.


Building Reliable, Asynchronous Database Applications Using Service Broker

Roger Wolter

February 2005
Updated June 2005

Applies to:
Microsoft SQL Server 2005 Service Broker

Summary: Microsoft SQL Server 2005 Service Broker is a new platform for building distributed asynchronous database applications. Including an asynchronous, reliable messaging feature in the SQL Server database makes it possible to build a variety of database applications that were difficult, if not impossible, to build before. Service Broker and the other developer features in SQL Server 2005, such as common language runtime (CLR) integration and the XML data type, combine to make SQL Server 2005 a great platform for building highly scaleable database applications. (13 printed pages)


Contents
Introduction
Why Write Asynchronous, Queued Applications?
Why are Queued Applications Hard to Write?
Queuing at the Grocery Store
Queuing at the Airport
Service Broker Programming
Why Do Messaging in the Database?
Scenarios
Conclusion



Introduction

One of the major pushes in the development of Microsoft SQL Server 2005 was to enable reliable, scalable, and functionally rich database applications. CLR (common language runtime) integration made it possible for developers to incorporate significant business logic into stored procedures and new functionality in Transact-SQL and XML expanded the available range of data manipulation and storage functions available to the developer. The other significant feature in this area is SQL Server Service Broker, which adds reliable, scaleable, distributed, asynchronous functionality to database applications.


Why Write Asynchronous, Queued Applications?

When we talked to people who have successfully built large, scalable database applications, we found that almost all of their applications had one or more operations that were executed in an asynchronous, queued manner. Stock trading systems queue the settlement activity so it can execute in the background while the front end goes on to handle other trades. Order entry systems put the shipping information in a queue to be read later by a shipping system that is running on another server—often in another location. Travel booking systems make the actual reservations after the customer has completed the itinerary and a confirmation e-mail message is sent after the reservations are made. In all these cases, the responsiveness of the system is improved by doing much of the work asynchronously so the interactive user doesn't have to wait for it to complete before receiving a response.

In most large systems, a careful analysis will uncover parts of the processing that can be done asynchronously. While the queued parts of the application don't need to be done immediately, the system has to ensure that these tasks are done reliably. Service Broker makes the execution of asynchronous, queued tasks reliable and simple to implement.

Another advantage to making parts of your application asynchronous is that this part of the processing can be done when resources are available. If the shipping tasks in an order entry system are executed from a queue, the shipping system doesn't have to be able to handle the peak order load. In many order entry systems, the peak order rate might be two or three times the average order rate. Since the order entry system must be able to handle the peak load, at slack times there is a significant amount of unused processing power available. If asynchronous tasks are queued during peak times and executed during slack times, system usage efficiency is significantly increased.

Why are Queued Applications Hard to Write?

If asynchronous, queued applications have so many advantages, why aren't all applications written that way? Because writing them is hard! Many developers who set out to use a database table as a queue in their application find out that it's a lot harder than it looks. A table that is used as a queue has multiple processes simultaneously inserting, reading, and deleting a small number of records. This leads to concurrency issues, performance challenges, and frequent deadlocks. While many developers have successfully met this challenge, it's a lot harder than it should be. Service Broker addresses these issues by making queues first-class database objects in the SQL Server 2005 database. Most of the gnarly issues associated with writing queues are already handled, so the developer can concentrate on writing asynchronous applications rather than writing the queuing infrastructure. The rest of this section goes into the hard problems involved in writing queued applications and explains how Service Broker helps resolve them.

Message Integrity

In many asynchronous, queued applications, the queued messages are actually valuable business objects. For example, in an order entry system, if you put shipments in a queue to be processed later, losing the queued data means that orders won't get shipped. Many reliable messaging systems persist messages to the file system so the loss of a disk drive means the messages are lost. Service Broker messages are persisted in hidden database tables, so all the data integrity features that SQL Server offers for protecting your data work equally well for protecting your queued messages. If you are using database mirroring for disaster recovery, when your database fails over to the secondary site, all your messages are already there and the Service Broker applications continue to run with no data loss.

Multi-Reader Queues

Multi-reader queues are one of the most effective means of scaling out asynchronous applications. To demonstrate this, I would like to show how queuing works in a couple scenarios we're all familiar with.

Queuing at the Grocery Store

Most grocery stores scale out with multiple queues. Each checkout counter has its own queue so when you want to buy your groceries, you have to pick one. If you are like me, most of the time the queue you pick will be the one behind the cart that requires a price check on every other item and the customer who is paying with a post-dated third-party check. Customers that were still at home when you got in line will be done paying for their groceries before you get to the front of your line. This demonstrates one problem with scaling out by using multiple queues—a task queued behind a long running task doesn't get serviced promptly.

The other major issue with multiple queues is that adding a queue requires the rebalancing of tasks between queues, which can lead to a lot of wasted movement between queues. Think of the grocery cart demolition derby that happens when a new checkout counter opens.

Queuing at the Airport

While I hesitate to use airlines as an example of how to do things efficiently, the way most ticket counters work is a better model for efficient queuing than the grocery store. This is because multiple ticket agents service the same queue. There is only one queue so you don't have to worry about getting in the wrong one. If a particular passenger takes a long time, the other ticket agents can continue to service the line (assuming there is more than one ticket agent).

The single queue with multiple readers also scales without difficulty. If the line is getting too long, other agents can be added without disrupting the line. Agents can also leave after servicing their current passenger without causing too much chaos in the queue.

At the risk of stretching this analogy too far, we can use the airport queue to illustrate another common issue with queue-based applications. Think about what happens when several people in the line are part of a group. For example, my family is checking in for a trip. The family is scattered throughout the line because we all arrived at different times. If we want to sit together, an agent will have to reserve a block of seats. If my wife and I end up at different ticket agents at the same time, I might reserve five seats in row 4 and my wife five seats in row 47. This is one of the key problems with multi-reader queues—if related messages are processed on different threads simultaneously, coordination can be difficult. Think of an order header and an order line being processed simultaneously. The reader processing the order line would think there was no header for the order because it wasn't in the database yet. In order to work correctly, the order line would have to retry the check for the order header multiple times to ensure that it was delayed and not missing.

An easier way to make this work would be if the first person to get to a ticket agent called all of the related passengers to the front so they could all be processed by the same ticket agent. Service Broker does a similar thing by putting a lock on related messages when one of the messages is received. The reader holding the lock can receive any of the messages in the queue that belong to the same group but no other reader can read one of them. The lock is held until the transaction it was taken in commits. This lock is called a conversation group lock. A conversation group is a developer-defined grouping of related messages. For example, all of the dialogs necessary to process a particular order—order header, order lines, inventory, shipping, billing, etc.—might be put into the same conversation group. When a message from one of these conversations is read, a lock is placed on the group so that only the reader holding the lock can process any of the related messages in the queue. It's important to note that only the messages from a single group are locked. There may be hundreds of groups being processed simultaneously but each group is processed on only one thread at a time. The lock lasts until the transaction that it was created under commits or rolls back.

The last issue I would like to illustrate with this analogy is what happens when another message from a conversation group arrives after the transaction holding the conversation group commits. In the case of the ticket counter analogy, this would be one of my kids getting to the airport after all the rest of us have checked in. Since the initial transaction is over, the last passenger may get processed by any of the ticket agents. The only way the new agent will know where to seat this passenger is if the initial agent left a note indicating where the rest of the party was seated. In the same way, once a transaction that processes a related group of messages is complete, it must record the "state" of the conversation so that when the next message belonging to this group arrives, the queue reader that receives the message will know where the previous transaction left off. Since this is a database application, the natural place to store this state is in a database table. Service Broker provides a handy way to tie the state of the conversation to the messages in the conversation—the conversation group ID. This is a uniqueidentifier that appears with every message in the conversation group. If this uniqueidentifier is used as a key in the table where the state is stored, it's easy for the message processing logic to find the state that is associated with every message it receives. Also, because only one queue reader can handle messages from a particular conversation group at a time, the developer doesn't have to worry about a state row being updated by two transactions at the same time, thereby causing the loss of state information.

As you can see from these examples, a multi-reader queue is a simple, effective way to scale out a large application. The conversation group locking mechanism provided by Service Broker makes writing applications that use multi-reader queues as easy to write as applications that use a single-reader queue.

Distribution

So far, we have been talking about queues as if they all exist in a single database. To build the kind of loosely coupled, distributed database applications required for many business scenarios, we have to expand this to include queues in many databases scattered across a network that is communicating through reliable messaging. We need reliable messaging because it doesn't make sense to use a database to ensure the integrity of messages in a queue and then take a chance on losing them when they are transferred to another database.

Service Broker uses a reliable messaging protocol called the dialog protocol to ensure that messages sent to a remote queue arrive exactly once and in order. Just as dialogs are bidirectional conversations, the dialog protocol supports message transfer in both directions simultaneously.

Dialog messages have a header that ensures that the message gets delivered securely to the proper destination in the correct order. It contains a sequence number, an identifier for the dialog it is in, the name of the service it is being sent to, security information, and some other information that is used to control message delivery. When a destination successfully receives a message, it acknowledges receipt of the message so that the source knows it was delivered successfully. Whenever possible, this acknowledgement is sent in the header of another message going back to the source so the number of messages is minimized. If the source doesn't receive an acknowledgement within a time limit, the message is resent until it is delivered successfully.

Message delivery systems often have problems delivering large messages. A gigabyte-sized message can take several minutes to send, which can effectively tie up a network connection for a significant amount of time. If a network error causes the message to be resent multiple times, network performance can be severely affected. The Service Broker dialog protocol deals with this issue by splitting very large messages up into several smaller fragments which are sent individually. If a network error causes a resend, only the message fragment that failed is resent. This is the reason Service Broker is able to support 2-GB message sizes while many reliable messaging systems can only send messages of 100 MB or less.

Transactional Messaging

"Exactly once" message processing requires transactional messages. To see why this is true, think about an application that crashes part way through processing a message. When the application restarts, how does it know whether to process the message it was processing when it crashed? The database may have already been updated with the results of the message processing already so that reprocessing the message may duplicate the data. As you can see, the only safe way to handle this is to make receiving the message part of the same transaction that updates the database. That way, if the system crashes, both the database update and the message receive are rolled back so the database and the message queue are in the same state as they were before the crash.

Because all Service Broker operations take place in the context of a database transaction, the transactional integrity of messaging operations is preserved. A typical Service Broker message-processing transaction will have the following steps:

1. Begin transaction.
2. Receive one or more messages from a conversation group.
3. Retrieve the state of the conversation from the state tables.
4. Process the messages and make one or more updates to application data based on the message contents.
5. Send out some Service Broker messages—either responses to incoming messages or messages to other services required to process the incoming message.
6. Read and process more messages for this conversation group if any are available.
7. Update the conversation state tables with the new state of the conversation.
8. Commit the transaction.
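
As a rough T-SQL sketch of that pattern (the queue name and the work inside the IF block are placeholders, not part of the original article):

-- Receive and process one message inside a single transaction.
BEGIN TRANSACTION;

DECLARE @handle uniqueidentifier, @msgType nvarchar(256), @body varbinary(max);

-- Wait up to 5 seconds for the next available message.
WAITFOR (
    RECEIVE TOP (1)
        @handle  = conversation_handle,
        @msgType = message_type_name,
        @body    = message_body
    FROM dbo.OrderQueue            -- hypothetical queue name
), TIMEOUT 5000;

IF @handle IS NOT NULL
BEGIN
    -- Check @msgType, look up the conversation state, update application
    -- tables, and SEND any response messages here.
    PRINT 'processing message';    -- placeholder for the real work
END

COMMIT TRANSACTION;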


The powerful part of Service Broker transactional messaging is that if the system crashes or the application fails, the transaction rolls back and everything is back to the way it was when the transaction started—the state is unchanged, the application data is unchanged, no messages are sent, and the received messages are back on the queue. This makes error handling in this type of application very straightforward.

Queue-Reader Management

Message processing in a Service Broker application starts when the queue reader receives a message from a queue. Since messages are always pulled from a queue, the receiving application must be running when messages arrive in a queue. This is an issue with many asynchronous messaging applications—how do you ensure that the queue reader will be running when it is required? The two traditional approaches have been to make the queue reader a service that runs continuously or to use triggers that the messaging system fires when each message arrives. The Microsoft Windows service approach means that an application is running even when there are no messages to process. The trigger approach can have performance issues because the queue reader starts and stops frequently.

Service Broker takes a middle ground approach to queue-reader management called activation. To set up activation, the database administrator (DBA) associates a stored procedure with a Service Broker queue. When the first message arrives in the queue, the activation logic will start the specified stored procedure. The stored procedure is responsible for receiving and processing messages until the queue is empty. Once the queue is empty, the stored procedure can terminate to save resources.

If Service Broker determines that messages are being added to the queue faster than the stored procedure is able to process them, the activation logic will start additional copies of the stored procedure until either the stored procedures keep up with the incoming rate or the maximum number of stored procedures that are configured for the queue is reached. Since the number of queue readers servicing the queue expands and contracts as the incoming message rate changes, the right number of queue readers will be running at all times.
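
A minimal sketch of configuring activation; the queue name, procedure name, and reader limit below are placeholders:

-- Associate an activation stored procedure with a queue; Service Broker
-- starts up to MAX_QUEUE_READERS copies as the incoming message rate grows.
CREATE QUEUE dbo.OrderQueue
    WITH STATUS = ON,
    ACTIVATION (
        STATUS = ON,
        PROCEDURE_NAME = dbo.usp_ProcessOrderQueue,  -- hypothetical reader proc
        MAX_QUEUE_READERS = 5,
        EXECUTE AS OWNER
    )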

Service Broker Programming

Service Broker programming is designed to be familiar to a database programmer. Configuring a Service Broker application is done with the familiar CREATE, ALTER, and DROP data definition language (DDL) statements that are used to configure other database objects. The commands to create Service Broker dialogs and to send and receive messages on them are Data Manipulation Language (DML) extensions to the Transact-SQL language. The receive command syntax is similar to a select command and it returns a rowset containing messages, just as a select command returns a rowset containing rows. Developers accustomed to Transact SQL programming will find it very easy to learn Service Broker programming. The client APIs that are used to program the Service Broker are the same as the APIs used for all database programming—OLE DB, ODBC (Open Database Connectivity), ADO (ActiveX Data Objects), ADO.NET, and so on.
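
As a small sketch of what that looks like (all object names here are invented for illustration):

-- Define the messaging objects with ordinary DDL...
CREATE MESSAGE TYPE [//Orders/SubmitOrder] VALIDATION = WELL_FORMED_XML
CREATE CONTRACT [//Orders/OrderContract]
    ([//Orders/SubmitOrder] SENT BY INITIATOR)
CREATE QUEUE dbo.OrderEntryQueue
CREATE SERVICE [//Orders/OrderEntryService]
    ON QUEUE dbo.OrderEntryQueue ([//Orders/OrderContract])

-- ...then open a dialog and send a message with the DML extensions.
DECLARE @dialog uniqueidentifier
BEGIN DIALOG CONVERSATION @dialog
    FROM SERVICE [//Orders/OrderEntryService]
    TO SERVICE '//Orders/ShippingService'    -- hypothetical target service
    ON CONTRACT [//Orders/OrderContract];
SEND ON CONVERSATION @dialog
    MESSAGE TYPE [//Orders/SubmitOrder]
    (N'<Order OrderID="42"/>');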

Why Do Messaging in the Database?

One of the questions frequently asked about Service Broker is, "Why build messaging into the database? Aren't there already enough reliable messaging systems out there?"

I hope that the information presented in this paper has helped to explain the decision to build Service Broker into the database engine, but here are a few more reasons why it makes sense to have messaging in the database:

* Single client connection for messages and data. In addition to the unified programming model mentioned in the previous section, this offers some other substantial advantages:
  * An application can receive messages transactionally when running on any client that can connect to the database. Many messaging systems only allow transactional receives when the receiver is running on the same computer as the queue.
  * Transactional messaging doesn't require a distributed transaction or two-phase commit, unlike message systems that don't store messages in the database.
* Integrated management, deployment, and operations between data and messages. All the tools and techniques you use to safeguard and manage your database data apply equally well to messages:
  * Backing up and restoring the database also backs up and restores queued messages.
  * If you use clustering or database mirroring to protect your database from failures, your messages enjoy the same protection.
  * Since queues have relational views available, it's simple to find out what's happening in the queue. Want to know how many messages are in the queue? Select count(*) from queue. Want to know which messages haven't been delivered yet? Select * from sys.transmission_queue. If your messages contain XML data, you can use XQuery to search them. You can join messages in queue with state information and even data tables to find out the complete status of a particular order in your order entry system.
* There are also some significant performance advantages for messaging built into the database:
  * As previously mentioned, two-phase commits are not required for transactional messages.
  * The message updates, the state changes, and the data updates are all logged in the same transaction log so only a single log write is required to commit the transaction.
  * Reliable message delivery usually transfers the message from a transmit queue to a receive queue. If Service Broker detects that the receive queue is in the same database instance as the transmit queue, the message is put directly onto the receive queue, thereby saving extra I/O and a transaction commit.

Scenarios

Now that we've seen how Service Broker works, let's look at a few of the many possible applications that you can build with Service Broker.

Order Entry

Order entry is a commonly used scenario because it's one process that just about everyone understands. Even developers who haven't worked on an order entry system have at least used one to order something.

The Service Broker-based order entry system uses queues to connect the subsystems of the system together. This not only increases parallelism and improves throughput, but it also provides a great deal of flexibility in configuration and architecture.

In this scenario, Service Broker is used to connect four loosely coupled services used to process each order. As the order entry service inserts the order headers and order lines into the database, it queues messages to the billing, shipping, inventory, and credit limit services to finish processing the order. Service Broker allows all four services to run in parallel, improving the response time of the system.

Depending on the business requirements of the system, the order entry service can either wait for responses from all four services before it returns to the user, or it can return as soon as the initial transaction is committed and let the rest of the services execute in the background. Implementing either behavior is a minor change to the order entry service. The other services are the same in either case.

Using Service Broker to link the services also offers several deployment options. All five services can run on the same server or they can be split up across up to five servers (or more if services are load balanced) as required to provide the required throughput. The service can also be deployed as stored procedures or as external applications. This flexibility is especially attractive to Independent Software Vendors (ISVs), who can create a single code base and deploy it in a wide variety of configurations depending on the customer's performance, redundancy, and throughput requirements.

Figure 1 illustrates the relationships among the queues and services that constitute the loosely coupled order entry system.


Figure 1. Loosely coupled order entry system

Parallel Stored Procedure


When triggers were first invented, database companies used to talk about using a trigger on an inventory table to automatically place a purchase order when the inventory level dropped too low. While this was a great concept, very few systems use it because the extra overhead of executing a lot of code in a trigger makes the database update too slow. Service Broker makes this kind of application possible by allowing the trigger to queue the work to be done so that the trigger can complete and allow the original transaction to commit. The queued work is then done later in a separate transaction—maybe even in a different database. SQL Server 2005 uses this model for query notifications and event notifications.

Service Broker also allows a stored procedure to start several other stored procedures in parallel. This can significantly improve response time. For example, consider a call center application that uses caller-ID on incoming calls to look up all the information about the customer who is calling so that the customer service representative has all the relevant information available. In many cases, this information must be obtained from different databases on different systems. Using remote queries to obtain all this information is possible but the response time may not be acceptable. The Service Broker can queue requests to all the remote services simultaneously and wait for the results in its input queue. Since all the requests are processed in parallel, the overall response time is improved. Figure 2 illustrates this processing.




Figure 2. Parallel stored procedure

Batch Processing

One of the significant use cases for Service Broker applications is a large batch-processing system. Most batch processes are made up of many small, semi-independent processes that must be scheduled and coordinated. Independent execution of sub processes improves throughput by allowing each subsystem to execute at its own optimal pace.

In the example illustrated in Figure 3, the inputs to the batch planning process—orders, forecasts, returns, etc.—are accumulated in input queues throughout the day. When the planning engine runs, it reads inputs from the queues, analyzes them, and then queues requests to the sub processes that process the plan outputs. The output queues allow the sub processes to execute independently, in parallel on one server or on a number of servers. This allows the process to scale out to as many servers as are required to handle the processing load.
Figure 3. Batch processing


Travel Booking

In a previous life, I taught a distributed database class for a Master's degree program. One example I used for distributed transactions was a travel agent that booked plane reservations, hotel rooms, and rental cars as part of the same transaction because the customer wouldn't want to book the hotel if the airline trip wasn't available. Like many examples, this worked until a programmer from a real travel agency showed up in class one day. He told me that no real travel agency could work this way. Airlines won't let anyone hold locks on their reservation tables and seat maps while an agent looks for a hotel reservation. Reservations are made based on current availability and if the seat or room is gone when the real reservation is made, the customer is called back to make sure the new reservation is OK.

In the Travel Booking Web Site scenario, bookings are made based on data in the flight and hotel availability tables. These tables are populated with information from the airlines and hotels. The information is updated frequently but it is always a little out of date. The actual booking doesn't take place until after the customer has completed booking his trip and possibly even after he has logged off.

This kind of delayed activity is ideally suited to the Service Broker architecture. The transaction that records the customer's travel arrangements commits messages to back-end services that do the actual booking. The booking services read messages from the queue and process each booking in a separate transaction. The booking service communicates through a variety of protocols to the systems that book the airline and hotel reservations. These communications might include XML Web services, SNA (Systems Network Architecture), HTTP, EDI (Electronic Data Interchange), faxes, Service Broker, etc. Because the input to the booking service is queued, the different latencies of these protocols aren't a problem. If the queue gets too long, Service Broker activation can start more queue readers to handle the load. If the volume reaches a point where the booking servers can't keep up, more servers can be added simply by adding rows to the routing tables.

When all the reservations for a particular itinerary have been booked successfully, a message is queued to the SQL Mail server to send a confirmation e-mail message to the customer. If one or more of the reservations fails, a customer service representative is notified to help the customer rebook the trip.

The Process Updates service runs in the background and periodically receives availability information from hotels and airlines. This information is massaged into a common format and then published to the servers in the Web farm through Notification Services to update their availability tables.

The loosely coupled Service Broker architecture allows Web servers to be added to the server farm by loading the software, configuring the database, and subscribing to the availability feeds. This can all be scripted so it happens with a minimal amount of manual intervention. To add more back-end servers to the booking services, the service must be installed and configured and then the new server address must be added to the routing tables in the Web servers.
Figure 4. Travel booking




Conclusion

SQL Server 2005 Service Broker is a new platform for building distributed asynchronous database applications. Including an asynchronous, reliable messaging feature in the SQL Server database makes it possible to build a variety of database applications that were difficult, if not impossible, to build before. Service Broker, combined with the other developer features in SQL Server 2005 such as CLR integration and the XML data type, makes SQL Server 2005 a great platform for building highly scaleable database applications.

Table Variables In T-SQL

Microsoft introduced table variables with SQL Server 2000 as an alternative to using temporary tables. In many cases a table variable can outperform a solution using a temporary table, although we will need to review the strengths and weaknesses of each in this article.

Table variables store a set of records, so naturally the declaration syntax looks very similar to a CREATE TABLE statement, as you can see in the following example:


DECLARE @ProductTotals TABLE
(
ProductID int,
Revenue money
)


While connected to the Northwind database, we could write the following statement to populate the table variable:

INSERT INTO @ProductTotals (ProductID, Revenue)
SELECT ProductID, SUM(UnitPrice * Quantity)
FROM [Order Details]
GROUP BY ProductID

You can use table variables in batches, stored procedures, and user-defined functions (UDFs). We can UPDATE records in our table variable as well as DELETE records

UPDATE @ProductTotals
SET Revenue = Revenue * 1.15
WHERE ProductID = 62

DELETE FROM @ProductTotals
WHERE ProductID = 60


SELECT TOP 5 *
FROM @ProductTotals
ORDER BY Revenue DESC



You might think table variables work just like temporary tables (CREATE TABLE #ProductTotals), but there are some differences.

Scope

Unlike the majority of the other data types in SQL Server, you cannot use a table variable as an input or an output parameter. In fact, a table variable is scoped to the stored procedure, batch, or user-defined function just like any local variable you create with a DECLARE statement. The variable will no longer exist after the procedure exits - there will be no table to clean up with a DROP statement.

Although you cannot use a table variable as an input or output parameter, you can return a table variable from a user-defined function – we will see an example later in this article. However, because you can’t pass a table variable to another stored procedure as input, there are still scenarios where you’ll be required to use a temporary table, such as when calling stored procedures from inside other stored procedures and sharing table results.

The restricted scope of a table variable gives SQL Server some liberty to perform optimizations.

Performance

Because of the well-defined scope, a table variable will generally use fewer resources than a temporary table. Transactions touching table variables only last for the duration of the update on the table variable, so there is less locking and logging overhead.

Using a temporary table inside of a stored procedure may result in additional re-compilations of the stored procedure. Table variables can often avoid this recompilation hit. For more information on why stored procedures may recompile, look at Microsoft knowledge base article 243586 (INF: Troubleshooting Stored Procedure Recompilation).

Other Features

Constraints are an excellent way to ensure the data in a table meets specific requirements, and you can use constraints with table variables. The following example ensures ProductID values in the table will be unique, and all prices are less than 10.0.


DECLARE @MyTable TABLE
(
ProductID int UNIQUE,
Price money CHECK(Price < 10.0)
)
You can also declare primary keys, identity columns, and default values:


DECLARE @MyTable TABLE
(
ProductID int IDENTITY(1,1) PRIMARY KEY,
Name varchar(10) NOT NULL DEFAULT('Unknown')
)



So far it seems that table variables can do anything temporary tables can do (within the scope of a stored procedure, batch, or UDF), but there are some drawbacks.

Restrictions

You cannot create a non-clustered index on a table variable, unless the index is a side effect of a PRIMARY KEY or UNIQUE constraint on the table (SQL Server enforces any UNIQUE or PRIMARY KEY constraints using an index).

Also, SQL Server does not maintain statistics on a table variable, and statistics are used heavily by the query optimizer to determine the best method to execute a query. Neither of these restrictions should be a problem, however, as table variables generally exist for a specific purpose and aren’t used for a wide range of ad-hoc queries.

The table definition of a table variable cannot change after the DECLARE statement. Any ALTER TABLE query attempting to alter a table variable will fail with a syntax error. Along the same lines, you cannot use a table variable with SELECT INTO or INSERT EXEC queries. If you are using a table variable in a join, you will need to alias the table in order to execute the query.


SELECT ProductName, Revenue
FROM Products P
INNER JOIN @ProductTotals PT ON P.ProductID = PT.ProductID


You can use a table variable with dynamic SQL, but you must declare the table inside the dynamic SQL itself. The following query will fail with the error “Must declare the variable '@MyTable'.”

DECLARE @MyTable TABLE
(
ProductID int ,
Name varchar(10)
)

EXEC sp_executesql N'SELECT * FROM @MyTable'
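
A sketch of the workaround is shown below, with the declaration moved inside the dynamic batch (the inserted value is made up for illustration):

-- Works: the table variable is declared inside the dynamic SQL batch itself,
-- so it is in scope for the statements executed by sp_executesql.
EXEC sp_executesql N'
    DECLARE @MyTable TABLE
    (
    ProductID int,
    Name varchar(10)
    )
    INSERT INTO @MyTable VALUES (1, ''Widget'')
    SELECT * FROM @MyTable'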


It’s also important to note how table variables do not participate in transaction rollbacks. Although this can be a performance benefit, it can also catch you off guard if you are not aware of the behavior. To demonstrate, the following query batch will return a count of 77 records even though the INSERT took place inside a transaction with ROLLBACK.

DECLARE @ProductTotals TABLE
(
ProductID int ,
Revenue money
)

BEGIN TRANSACTION

INSERT INTO @ProductTotals (ProductID, Revenue)
SELECT ProductID, SUM(UnitPrice * Quantity)
FROM [Order Details]
GROUP BY ProductID

ROLLBACK TRANSACTION

SELECT COUNT(*) FROM @ProductTotals



Choosing Between Temporary Tables and Table Variables

Now you’ve come to a stored procedure that needs temporary resultset storage. Knowing what we have learned so far, how do you decide between using a table variable and a temporary table?

First, we know there are situations that demand the use of a temporary table. This includes calling nested stored procedures which use the resultset, certain scenarios using dynamic SQL, and cases where you need transaction rollback support.

Secondly, the size of the resultset will determine which solution to choose. If the table stores a resultset so large you require indexes to improve query performance, you’ll need to stick to a temporary table. In some borderline cases try some performance benchmark testing to see which approach offers the best performance. If the resultset is small, the table variable is always the optimum choice.

An Example: Split

Table variables are a superior alternative to using temporary tables in many situations. The ability to use a table variable as the return value of a UDF is one of the best uses of table variables. In the following sample, we will address a common need: a function to parse a delimited string into pieces. In other words, given the string “1,5,9” – we will want to return a table with a record for each value: 1, 5, and 9.

The following user-defined function will walk through an incoming string and parse out the individual entries. The UDF inserts the entries into a table variable and returns the table variable as the result. As an example, calling the UDF with the following SELECT statement:

SELECT * FROM fn_Split('foo,bar,widget', ',')

will return the following result set.

position value
1 foo
2 bar
3 widget

We could use the resultset in another stored procedure or batch as a table to select against or filter with. We will see why the split function can be useful in the next OdeToCode article. For now, here is the source code to fn_Split


if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[fn_Split]') and xtype in (N'FN', N'IF', N'TF'))
drop function [dbo].[fn_Split]
GO


SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS OFF
GO

CREATE FUNCTION fn_Split(@text varchar(8000), @delimiter varchar(20) = ' ')
RETURNS @Strings TABLE
(
position int IDENTITY PRIMARY KEY,
value varchar(8000)
)
AS
BEGIN

DECLARE @index int
SET @index = -1

WHILE (LEN(@text) > 0)
BEGIN
SET @index = CHARINDEX(@delimiter , @text)
IF (@index = 0) AND (LEN(@text) > 0)
BEGIN
INSERT INTO @Strings VALUES (@text)
BREAK
END
IF (@index > 1)
BEGIN
INSERT INTO @Strings VALUES (LEFT(@text, @index - 1))
SET @text = RIGHT(@text, (LEN(@text) - @index))
END
ELSE
SET @text = RIGHT(@text, (LEN(@text) - @index))
END
RETURN
END

GO

SET QUOTED_IDENTIFIER OFF
GO

SET ANSI_NULLS ON
GO



Summary

Next time you find yourself using a temporary table, think of table variables instead. Table variables can offer performance benefits and flexibility when compared to temporary tables, and you can let the server clean up afterwards.

Outer join restrictions

If a table is an inner member of an outer join, it cannot participate in both an outer join clause and a regular join clause. The following query fails because the salesdetail table is part of both the outer join and a regular join clause:

select distinct sales.stor_id, stor_name, title
from sales, stores, titles, salesdetail
where qty > 500
and salesdetail.title_id =* titles.title_id
and sales.stor_id = salesdetail.stor_id
and sales.stor_id = stores.stor_id


Msg 303, Level 16, State 1:
Server 'FUSSY', Line 1:
The table 'salesdetail' is an inner member of an outer-join clause. This is not allowed if the table also participates in a regular join clause.

If you want to know the name of the store that sold more than 500 copies of a book, you would have to use a second query. If you submit a query with an outer join and a qualification on a column from the inner table of the outer join, the results may not be what you expect. The qualification in the query does not restrict the number of rows returned, but rather affects which rows contain the null value. For rows that do not meet the qualification, a null value appears in the inner table's columns of those rows.
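
For comparison (this is not part of the original example), here is a sketch in ANSI join syntax against the same tables: a qualification placed in the ON clause only controls which inner rows match, while the same qualification placed in the WHERE clause filters the joined result.

-- Outer rows preserved; qty > 500 only controls which salesdetail rows match.
select titles.title, salesdetail.qty
from titles
left outer join salesdetail
    on salesdetail.title_id = titles.title_id
    and salesdetail.qty > 500

-- Outer rows removed; the WHERE clause filters the joined result.
select titles.title, salesdetail.qty
from titles
left outer join salesdetail
    on salesdetail.title_id = titles.title_id
where salesdetail.qty > 500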