Problem
If you work on SQL Server instances with multiple databases, there may be times when you need to run the same query or SQL statement across all databases.
I recently developed an ETL process for one of my clients; this involved writing a stored procedure to extract data from production OLTP source tables and insert the data into a reporting database.
The client has over 10 databases, which are dedicated for specific geographic locations. The extraction query needed to be executed against every source OLTP database and inserted into the destination table in the reporting database.
The main solution that first popped into my mind was writing a cursor. However, I wanted something more efficient that could more efficiently handle large record counts. After researching other alternatives, initially I decided to utilize sp_MSforeachdb; this is an undocumented stored procedure which allows you to execute a SQL command against all databases. However, below are some of the common issues with sp_MSforeachdb.
- Skipping databases
- Failing with problematic identifiers
- Random invalid column errors
- Data type conversion errors
The issues above are why the stored procedure is undocumented and unsupported by Microsoft. I specifically ran into the “random invalid column name” error when excluding system databases.
Solution
I decided to write a cursor to loop through the databases and perform the extraction. Now before write a massive comment about how cursors “are a nightmare”, I am aware of all the potential implications. The databases are small in size and the extracts are pulling low amounts of records as well. Long term I may attempt to convert this to a WHILE loop, but the cursor is my current best solution for the project.
First, I needed to create a table containing the data sources. I’m utilizing the cursor to loop through each database (data source) and execute a SQL statement. This is a simple two column table containing the fully qualified table name [database].[schema].[table] and the [schema].[table] name.
The cursor below uses @@FETCH_STATUS to control cursor activities in a WHILE loop. It loops through the source table that I created and executes the insert/select statement for each data source. The cursor allows me to insert data into the destination table sequentially by selecting from each database on the instance, excluding system databases and the data warehouse database. After the cursor has looped through the entire result set, the loop exists.
Conclusion
For the purpose of this ETL process and the amount of records involved, the cursor is a great solution. However, it’s important to keep in mind that cursors are not always the right solution. There’s often a more efficient way than using a cursor, such as a WHILE loop.
In this specific ETL use case, using a cursor was the best solution that I could come up with. If there’s a better solution that you know of other than sp_MSforeachdb, please leave a comment on this post.
YouTube Channel Hits 39,000 Subs!
To be alerted when I upload new YouTube content, subscribe to the channel and turn on all notifications. I upload freelancing, personal finance, and tech content weekly. Drop a comment below with any questions, feedback, or new videos that you would like to see on my channel. Thanks for stopping by and I will see you in the next post!
Leave A Comment