How to Compare Two Tables in SQL Using Python
Updated: 2021-10-21
Problem
Sometimes we need to compare SQL Server tables and/or data to know what has changed. This article shows different ways to compare data, datatypes and table structures when using SQL Server.
Solution
I will cover different methods to identify changes by using various SQL queries as well as a couple development tools.
Let's say we have two similar tables in different databases and we want to know what is different. Here is a script that creates sample databases, tables and data.
CREATE DATABASE dbtest01
GO
USE dbtest01
GO
CREATE TABLE [dbo].[article] (
    [id] [nchar](10) NOT NULL,
    [type] [nchar](10) NULL,
    [cost] [nchar](10) NULL,
    CONSTRAINT [PK_article] PRIMARY KEY CLUSTERED ([id] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[article]
VALUES ('001', '1', '40'),
       ('002', '2', '80'),
       ('003', '3', '120')
GO

CREATE DATABASE dbtest02
GO
USE dbtest02
GO
CREATE TABLE [dbo].[article] (
    [id] [nchar](10) NOT NULL,
    [type] [nchar](10) NULL,
    [cost] [nchar](10) NULL,
    CONSTRAINT [PK_article] PRIMARY KEY CLUSTERED ([id] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[article]
VALUES ('001', '1', '40'),
       ('002', '2', '80'),
       ('003', '3', '120'),
       ('004', '4', '160')
GO
The T-SQL code creates two tables in different databases. The table names are the same, but the table in database dbtest02 contains one extra row (id '004').
Let's look at ways we can compare these tables using different methods.
Compare SQL Server Data in Tables Using a LEFT JOIN
With a LEFT JOIN we can find rows in one table that have no matching row in the other table, based on the join columns.
For example, with this SELECT statement:
SELECT * FROM dbtest02.dbo.article d2 LEFT JOIN dbtest01.dbo.article d1 ON d2.id = d1.id
The result set from the LEFT JOIN shows all rows from the left table "dbtest02.dbo.article", even if there are no matches in table "dbtest01.dbo.article":
In this example, we are comparing two tables, and NULL values are displayed in the dbtest01 columns where there is no matching row. This method works for finding new rows, but if values in other columns are updated, a LEFT JOIN on id alone does not reveal the change.
This can be done both ways to see if there are differences the other way around. This SQL statement will just return the 3 matching rows.
SELECT * FROM dbtest01.dbo.article d1 LEFT JOIN dbtest02.dbo.article d2 ON d1.id = d2.id
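To try the same idea from Python, here is a minimal sketch that uses the standard library sqlite3 module with an in-memory database as a stand-in for SQL Server. The table names article01 and article02 (standing in for dbtest01.dbo.article and dbtest02.dbo.article) and the helper names are assumptions made for illustration. Unlike the queries above, the sketch adds a WHERE ... IS NULL filter so that only the unmatched rows are returned.

```python
import sqlite3


def make_sample_db():
    """Build an in-memory SQLite database with two tables standing in for
    dbtest01.dbo.article and dbtest02.dbo.article (assumed names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE article01 (id TEXT PRIMARY KEY, type TEXT, cost TEXT);
        CREATE TABLE article02 (id TEXT PRIMARY KEY, type TEXT, cost TEXT);
        INSERT INTO article01 VALUES
            ('001', '1', '40'), ('002', '2', '80'), ('003', '3', '120');
        INSERT INTO article02 VALUES
            ('001', '1', '40'), ('002', '2', '80'), ('003', '3', '120'),
            ('004', '4', '160');
    """)
    return conn


def rows_missing_from(conn, left, right):
    """Return rows of `left` whose id has no match in `right`."""
    # Table names are interpolated only because they are fixed, trusted
    # strings in this sketch; never do this with user input.
    sql = f"""
        SELECT l.id, l.type, l.cost
        FROM {left} AS l
        LEFT JOIN {right} AS r ON l.id = r.id
        WHERE r.id IS NULL
        ORDER BY l.id
    """
    return conn.execute(sql).fetchall()


conn = make_sample_db()
only_in_02 = rows_missing_from(conn, "article02", "article01")
only_in_01 = rows_missing_from(conn, "article01", "article02")
print(only_in_02)  # the extra row with id '004'
print(only_in_01)  # empty: every id in article01 also exists in article02
```

Against a real SQL Server instance the same SQL could be sent through a database driver, but connection details are outside the scope of this sketch.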
SQL Server Data Comparison in Tables Using the EXCEPT Clause
EXCEPT returns the distinct rows from the first query that do not appear in the second query, so it can be used to find the differences between two tables (Oracle uses MINUS instead of EXCEPT; the syntax and usage are the same). Recall that the two sample tables differ by one row.
Now let's run a query using except:
SELECT * FROM dbtest02.dbo.article EXCEPT SELECT * FROM dbtest01.dbo.article
The except returns the difference between the tables from dbtest02 and dbtest01:
If we flip the tables around in the query we won't see any records, because the table in database dbtest02 has all of the records plus one extra.
SELECT * FROM dbtest01.dbo.article EXCEPT SELECT * FROM dbtest02.dbo.article
This method is better than the first one, because if we change values for other columns like the type and cost, EXCEPT will notice the difference.
For example, suppose we update id "001" in database dbtest01 and change the cost from "40" to "1". Running the first EXCEPT query again (dbtest02 EXCEPT dbtest01) now returns two rows from dbtest02: id "001", whose cost is still "40" there, and the extra row with id "004".
Unfortunately, EXCEPT does not generate a script to synchronize the tables.
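The EXCEPT behavior described in this section can be reproduced from Python with a small sketch. It again uses sqlite3 in memory as a stand-in for SQL Server (SQLite also supports EXCEPT); the table names article01 and article02 and the helper except_rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article01 (id TEXT PRIMARY KEY, type TEXT, cost TEXT);
    CREATE TABLE article02 (id TEXT PRIMARY KEY, type TEXT, cost TEXT);
    INSERT INTO article01 VALUES
        ('001', '1', '40'), ('002', '2', '80'), ('003', '3', '120');
    INSERT INTO article02 VALUES
        ('001', '1', '40'), ('002', '2', '80'), ('003', '3', '120'),
        ('004', '4', '160');
""")


def except_rows(conn, left, right):
    """Rows of `left` that do not appear, column for column, in `right`."""
    # Trusted, fixed table names only; sorted in Python for stable output.
    sql = f"SELECT * FROM {left} EXCEPT SELECT * FROM {right}"
    return sorted(conn.execute(sql).fetchall())


# Before any update: only the extra row differs, and only in one direction.
before = except_rows(conn, "article02", "article01")
before_reversed = except_rows(conn, "article01", "article02")

# Change the cost of id '001' in the first table, as in the text.
conn.execute("UPDATE article01 SET cost = '1' WHERE id = '001'")

after = except_rows(conn, "article02", "article01")
after_reversed = except_rows(conn, "article01", "article02")

print(before, before_reversed)
print(after, after_reversed)
```

After the update, both directions return the row for id '001', each with its own version of the cost, which is why EXCEPT catches changes that a join on id alone misses.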
Compare SQL Server Data in Tables Using the Tablediff Tool
SQL Server includes tablediff, a free command line tool for comparing tables. It can be found in the "C:\Program Files\Microsoft SQL Server\110\COM\" folder (110 is the SQL Server 2012 version folder, so the number depends on the installed version). It also generates a script with the INSERT, UPDATE and DELETE statements needed to synchronize the tables. For more details, refer to this table diff article.
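To illustrate what a synchronization script contains, here is a rough Python sketch that derives INSERT, UPDATE and DELETE statements from two sets of rows. This is not how tablediff is implemented and does not reproduce its output format; the function names, the dictionary layout and the default table name are all assumptions for illustration.

```python
def quote(value):
    """Render a value as a SQL string literal (or NULL); sketch only."""
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"


def sync_script(source, dest, table="dbo.article", columns=("type", "cost")):
    """Return statements that would make `dest` match `source`.

    Both arguments map id -> tuple of the non-key column values.
    """
    col_list = ", ".join(columns)
    statements = []
    # Rows present only in the source must be inserted.
    for key in sorted(source.keys() - dest.keys()):
        values = ", ".join(quote(v) for v in (key, *source[key]))
        statements.append(
            f"INSERT INTO {table} (id, {col_list}) VALUES ({values});"
        )
    # Rows present in both but with different values must be updated.
    for key in sorted(source.keys() & dest.keys()):
        if source[key] != dest[key]:
            sets = ", ".join(
                f"{col} = {quote(val)}" for col, val in zip(columns, source[key])
            )
            statements.append(
                f"UPDATE {table} SET {sets} WHERE id = {quote(key)};"
            )
    # Rows present only in the destination must be deleted.
    for key in sorted(dest.keys() - source.keys()):
        statements.append(f"DELETE FROM {table} WHERE id = {quote(key)};")
    return statements


# Source mirrors dbtest02; destination mirrors dbtest01 after the cost update.
source = {"001": ("1", "40"), "002": ("2", "80"),
          "003": ("3", "120"), "004": ("4", "160")}
dest = {"001": ("1", "1"), "002": ("2", "80"), "003": ("3", "120")}

script = sync_script(source, dest)
for statement in script:
    print(statement)
```

Generated statements like these should be reviewed before being run against a real database.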
Compare SQL Server Data in Tables Using Change Data Capture (CDC)
This feature is available in SQL Server 2008 and later. You need to enable this feature and you also need to have SQL Server Agent running. Basically it creates system tables that track the changes in your tables that you want to monitor. It does not compare tables, but it tracks the changes in tables.
For more information, refer to these Change Data Capture (CDC) tips.
Compare SQL Server Data Types Between Two Tables
What happens if we want to compare the data types of two tables?
We can use the [INFORMATION_SCHEMA].[COLUMNS] system view to verify and compare the information. We are going to create a new table named dbo.article2 in which one column has a different data type than in the dbo.article table:
USE dbtest01
GO
CREATE TABLE [dbo].[article2] (
    [id] [int] NOT NULL,
    [type] nchar(10) NULL,
    [cost] nchar(10) NULL,
    CONSTRAINT [PK_article1] PRIMARY KEY CLUSTERED ([id] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
The difference is that the id is now an int instead of nchar(10) like the other tables.
The query to compare the data types of tables article and article2 would be:
USE dbtest01
GO
SELECT c1.table_name,
       c1.COLUMN_NAME,
       c1.DATA_TYPE,
       c2.table_name,
       c2.DATA_TYPE,
       c2.COLUMN_NAME
FROM [INFORMATION_SCHEMA].[COLUMNS] c1
LEFT JOIN [INFORMATION_SCHEMA].[COLUMNS] c2
       ON c1.COLUMN_NAME = c2.COLUMN_NAME
WHERE c1.TABLE_NAME = 'article'
  AND c2.TABLE_NAME = 'article2'
  AND c1.data_type <> c2.DATA_TYPE
The results are as follows:
The query compares the data types from these two tables. All the information of the columns can be obtained from the [INFORMATION_SCHEMA].[COLUMNS] system view. We are comparing the table "article" with table "article2" and showing if any of the datatypes are different.
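The same metadata comparison can be sketched in Python. SQLite has no INFORMATION_SCHEMA, so this sketch reads column metadata with PRAGMA table_info instead; that substitution, the simplified table definitions and the helper names are assumptions made so the example runs without a SQL Server instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article  (id NCHAR(10) NOT NULL PRIMARY KEY,
                           type NCHAR(10), cost NCHAR(10));
    CREATE TABLE article2 (id INT NOT NULL PRIMARY KEY,
                           type NCHAR(10), cost NCHAR(10));
""")


def column_types(conn, table):
    """Map column name -> declared type for one table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    # The table name is a fixed, trusted string in this sketch.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[2] for row in rows}


def type_mismatches(conn, table_a, table_b):
    """List (column, type_in_a, type_in_b) for shared columns whose types differ."""
    types_a = column_types(conn, table_a)
    types_b = column_types(conn, table_b)
    return [
        (col, types_a[col], types_b[col])
        for col in types_a
        if col in types_b and types_a[col] != types_b[col]
    ]


mismatches = type_mismatches(conn, "article", "article2")
for column, type_a, type_b in mismatches:
    print(f"{column}: article has {type_a}, article2 has {type_b}")
```

Only columns that exist in both tables are compared here, which matches the effect of the join on COLUMN_NAME in the query above.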
Compare if there are Extra Columns Between SQL Server Database Tables
Sometimes we need to make sure that two tables contain the same number of columns. To illustrate this we are going to create a table named "article3" with 2 extra columns named extra1 and extra2:
USE dbtest01
GO
CREATE TABLE [dbo].[article3] (
    [id] [int] NOT NULL,
    [type] nchar(10) NULL,
    [cost] nchar(10) NULL,
    extra1 int,
    extra2 int
)
In order to compare the columns I will use this query:
USE dbtest01
GO
SELECT c2.table_name,
       c2.COLUMN_NAME
FROM [INFORMATION_SCHEMA].[COLUMNS] c2
WHERE table_name = 'article3'
  AND c2.COLUMN_NAME NOT IN (
        SELECT column_name
        FROM [INFORMATION_SCHEMA].[COLUMNS]
        WHERE table_name = 'article'
      )
The query returns the columns of table "article3" that do not exist in table "article". In this case the result is the two columns extra1 and extra2.
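As a hedged Python counterpart, the sketch below finds the extra columns by comparing column name lists. It uses sqlite3 in memory and PRAGMA table_info in place of INFORMATION_SCHEMA.COLUMNS; the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article  (id NCHAR(10) NOT NULL, type NCHAR(10), cost NCHAR(10));
    CREATE TABLE article3 (id INT NOT NULL, type NCHAR(10), cost NCHAR(10),
                           extra1 INT, extra2 INT);
""")


def column_names(conn, table):
    """Return the column names of a table in declaration order."""
    # Index 1 of each PRAGMA table_info row is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def extra_columns(conn, table, reference):
    """Columns that exist in `table` but not in `reference` (the NOT IN logic)."""
    reference_names = set(column_names(conn, reference))
    return [name for name in column_names(conn, table)
            if name not in reference_names]


extras = extra_columns(conn, "article3", "article")
missing = extra_columns(conn, "article", "article3")
print(extras)   # columns only in article3
print(missing)  # columns only in article
```

Running the comparison in both directions shows whether either table has columns the other lacks.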
Compare SQL Server Tables in Different Databases
Now let's compare the tables in databases dbtest01 and dbtest02 with UNION logic (Learn the difference between UNION and UNION ALL) and a subquery with NOT IN logic. The following query lists the tables that exist in one database but not in the other:
SELECT 'dbtest01' AS dbname,
       t1.table_name
FROM dbtest01.[INFORMATION_SCHEMA].[tables] t1
WHERE table_name NOT IN (
        SELECT t2.table_name
        FROM dbtest02.[INFORMATION_SCHEMA].[tables] t2
      )
UNION
SELECT 'dbtest02' AS dbname,
       t1.table_name
FROM dbtest02.[INFORMATION_SCHEMA].[tables] t1
WHERE table_name NOT IN (
        SELECT t2.table_name
        FROM dbtest01.[INFORMATION_SCHEMA].[tables] t2
      )
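The cross-database comparison can also be tried from Python. In this sketch, SQLite's ATTACH DATABASE provides a second in-memory database, and the sqlite_master catalog stands in for INFORMATION_SCHEMA.TABLES; the main database plays the role of dbtest01. These substitutions and the simplified table definitions are assumptions for illustration.

```python
import sqlite3

# The main database plays the role of dbtest01; a second in-memory
# database is attached under the name dbtest02.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS dbtest02")
conn.executescript("""
    CREATE TABLE main.article  (id TEXT, type TEXT, cost TEXT);
    CREATE TABLE main.article2 (id INT,  type TEXT, cost TEXT);
    CREATE TABLE main.article3 (id INT,  type TEXT, cost TEXT,
                                extra1 INT, extra2 INT);
    CREATE TABLE dbtest02.article (id TEXT, type TEXT, cost TEXT);
""")

# Same UNION + NOT IN shape as the T-SQL query, against SQLite's catalog.
SQL = """
    SELECT 'dbtest01' AS dbname, name
    FROM main.sqlite_master
    WHERE type = 'table'
      AND name NOT IN (SELECT name FROM dbtest02.sqlite_master
                       WHERE type = 'table')
    UNION
    SELECT 'dbtest02' AS dbname, name
    FROM dbtest02.sqlite_master
    WHERE type = 'table'
      AND name NOT IN (SELECT name FROM main.sqlite_master
                       WHERE type = 'table')
"""

table_differences = sorted(conn.execute(SQL).fetchall())
for dbname, table_name in table_differences:
    print(f"{table_name} exists only in {dbname}")
```

With the sample objects created earlier in this article, only article2 and article3 are reported, because they were created in dbtest01 alone.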
Compare Schemas Using SSDT
SQL Server Data Tools (SSDT) lets you compare the schemas of tables in two different databases using a SQL Server Database Project. It can generate scripts that let you synchronize them with a few clicks. Let's take a look at it.
1 - In the database project, go to the Solution Explorer and right click on the database and select the Schema Compare option to compare the tables:
2 - In the Select Target Schema, press the Select Connection to select the table destination to be compared with the table in the source. You will be able to select an existing connection or create a new one:
3 - Next, press the compare button and it will show the difference. It will show the tables to add or the tables to remove:
4 - The tool will show you the entire T-SQL script that you can apply or modify according to your requirements in the tables:
SSIS Lookup Option
This option is really popular in ETL (Extract, Transform, Load) processes. The Lookup is an SSIS transformation that looks up data by joining the rows in the data flow to a reference dataset. It allows you to detect changed data between 2 tables. In the following example, I will compare the tables.
1 - In the SSIS project, double click the Data Flow Task to add it to the Control Flow, so we can build a Data Flow sequence:
2 - In the Data Flow, drag and drop the OLE DB Source with the source table to compare, the Lookup task and 2 OLE DB Destination tasks:
3 - In the Lookup you can choose between full cache, partial cache and no cache modes. Full cache mode loads the reference data into the SSIS cache before the data flow runs. Partial cache mode adds values to the cache only as each distinct value is encountered in the data flow, querying the reference source for each new value. No cache mode queries the reference source for every row:
4 - The task will allow you to define the keys to compare and the lookup columns with the output alias used:
As you can see, the Lookup lets you look up and compare data, so you can find out what data changed between 2 tables.
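The logic behind a lookup with full cache can be sketched in plain Python. This is only a conceptual illustration, not SSIS code: the function name, the row layout and the three outputs are assumptions. The reference rows are loaded into a dictionary up front, and each source row is then routed to a changed, unchanged or no-match list.

```python
def lookup_split(source_rows, reference_rows, key="id"):
    """Route source rows by looking each one up in a cached reference set.

    Returns (changed, unchanged, no_match) lists of source rows.
    """
    # "Full cache": load every reference row once, keyed by the join column.
    cache = {row[key]: row for row in reference_rows}
    changed, unchanged, no_match = [], [], []
    for row in source_rows:
        reference = cache.get(row[key])
        if reference is None:
            no_match.append(row)    # no matching key in the reference
        elif reference != row:
            changed.append(row)     # key matches but other columns differ
        else:
            unchanged.append(row)
    return changed, unchanged, no_match


# Source mirrors dbtest02; reference mirrors dbtest01 after the cost update.
source_rows = [
    {"id": "001", "type": "1", "cost": "40"},
    {"id": "002", "type": "2", "cost": "80"},
    {"id": "003", "type": "3", "cost": "120"},
    {"id": "004", "type": "4", "cost": "160"},
]
reference_rows = [
    {"id": "001", "type": "1", "cost": "1"},
    {"id": "002", "type": "2", "cost": "80"},
    {"id": "003", "type": "3", "cost": "120"},
]

changed, unchanged, no_match = lookup_split(source_rows, reference_rows)
print([row["id"] for row in changed])
print([row["id"] for row in unchanged])
print([row["id"] for row in no_match])
```

In a data flow, the no-match and changed rows would typically be sent to different destinations, which is why the example above uses two OLE DB Destinations.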
Third Party Tools
There are great tools to compare tables including data and schemas. You can use Visual Studio or use other SQL Server Comparison tools.
Next Steps
- There are multiple tools and ways to compare data and schemas. Select the method that works best for your needs based on the above queries.
- For more information refer to these links:
- Compare SQL Server Datasets with INTERSECT and EXCEPT
- SQL Server Left Join
- SQL Server Change Data Capture
- SQL Server Tablediff Utility
- Using MERGE in SQL Server to insert, update and delete at the same time
- SQL Server Cursor Example
About the author
Daniel Calbimonte is a Microsoft SQL Server MVP, Microsoft Certified Trainer and Microsoft Certified IT Professional.
Source: https://www.mssqltips.com/sqlservertip/2779/ways-to-compare-and-find-differences-for-sql-server-tables-and-data/