SQL Intermediate 2: UNION Operator
SQL
Join us for our second SQL Intermediate series post as we explore the incredible power of SQL's UNION operator! Discover how this powerful tool allows you to effortlessly combine data from multiple tables or queries, simplifying data retrieval and creating unified outputs. Learn the key differences between UNION and UNION ALL, and when to use each one. Don't miss this opportunity to optimize your data manipulation skills and take your SQL knowledge to the next level!
Hello everyone and thank you for joining us on our second SQL Intermediate series post.
If you missed the first SQL Intermediate post:
SQL Intermediate 1: INNER and OUTER Joins
NOTE: For today’s exercise, we are using our ‘EmployeeDemographics’ table. However, let’s create a new table (‘EmployeeINFO’) with the same column names to make this activity easier.
If you don’t know how to create a table or need a cheat sheet, just go to:
SQL Basics 1: How to Create Table and Add Data in SSMS
Values for ‘EmployeeINFO’ table, if you want to copy and paste:
(1013, 'Cris', 'Got', 47, 'Female'),
(1014, 'Carlos', 'Gomez', 52, 'Male'),
(1040, 'John', 'Rey', 50, 'Male'),
(1041, 'Carol', 'Wood', 50, 'Female')
Feel free to create your own data. If you create duplicates between tables, you will get a better idea of UNION ALL.
The reason I’m talking about UNION right after JOIN is because they are similar by combining two tables to create one output.
The difference?
JOIN combines both tables based off a common column. When we do that, they are shown in different columns.
The output is both tables by side, separated. We cannot get the information in the same column.
It is when UNION comes into play!
With UNION, we can select all the data from both tables and put it into one output where all the data is in each column instead of separated.
Complicated?
Let’s move on and hopefully things get clear!
In the world of SQL (Structured Query Language), there are several powerful tools that help us manipulate and retrieve data from databases. Two essential operations for combining data from multiple tables or queries are the UNION and UNION ALL commands. These commands are frequently used in SQL to consolidate information and simplify data retrieval. In this blog post, we will explore the differences between UNION and UNION ALL and when to use each one.
UNION Operator in SQL:
The UNION operator in SQL is used to combine the results of two or more SELECT queries into a single result set.
This operator performs a distinct operation, meaning it removes duplicate rows from the final output. When you use UNION, the result set will only contain unique rows, eliminating any duplicates that may exist between the combined queries.
Union operator syntax:
SELECT: This keyword is used to specify the columns that you want to retrieve from the tables or queries. It is followed by a comma-separated list of column names, or you can use * to select all columns.
column1, column2: These represent the specific columns you want to retrieve data from. Replace them with the actual column names you need.
FROM: This keyword is used to specify the tables from which you want to retrieve data or the queries you want to combine.
table1, table2: These are the names of the tables or the result of queries from which you want to retrieve data. Replace them with the actual table names or queries you want to use.
UNION: This keyword combines the results of the two SELECT queries above it into a single result set, eliminating duplicates.
Example: Let's combine our 2 tables ‘EmployeeDemographics’ and ‘EmployeeINFO’.
“SELECT * FROM EmployeeDemographics UNION SELECT FROM EmployeeINFO” -> Execute
Now, the output is a table where values of ‘EmployeeINFO’ was added down below instead of a separated table on the side. UNION also works like a DISTINCT statement by taking out and removing duplicates.
UNION ALL Operator in SQL:
Similar to UNION, the UNION ALL operator also combines the results of multiple SELECT queries.
However, unlike UNION, UNION ALL does not perform a distinct operation. It includes all rows from the combined queries, even if there are duplicates between them. This means that the final result set may contain duplicate rows if they exist in the individual queries being combined.
In this case, since we didn’t have duplicates, the output is the same.
Key Differences Between UNION and UNION ALL Operatos in SQL:
The main differences between UNION and UNION ALL can be summarized as follows:
Duplicate Rows:
UNION: Removes duplicate rows from the final result set.
UNION ALL: Includes all rows from the combined queries, even if they are duplicates.
Performance:
UNION: Due to the extra step of removing duplicates, it may have slightly higher overhead than UNION ALL.
UNION ALL: Generally faster than UNION because it doesn't need to perform the distinct operation.
When to Use UNION and UNION ALL Operators in SQL:
Use UNION when you need to combine result sets and want to remove duplicate rows. For instance, when you want to merge tables with unique records only.
Use UNION ALL when you need to combine result sets and do not want to eliminate duplicate rows. This is useful when you want to merge tables and preserve all rows, even if they might have duplicates.
Let’s spice up a little using our ‘EmployeeSalary’ table that columns names are different, to see the result.
For this case, we have to specify the columns:
“SELECT EmployeeID, FirstName, Age FROM EmployeeDemographics UNION SELECT EmployeeID, JobTitle, Salary FROM EmployeeSalary” -> Execute
It works because:
Same amount of columns
The output is similar as FirstName and JobTitle are text and Age and Salary are integers.
HINT: We don’t really want to do that because, as we see the outcome table, FirstName column is mixing first name. With job title, and Age column is mixing age and salary results. Be careful to not mix your data.
In summary, both UNION and UNION ALL are valuable tools in SQL that allow us to combine data from multiple queries. Understanding their differences is crucial in deciding which one to use based on the specific requirements of the task at hand. Whether you need to eliminate duplicates or retain all rows, UNION and UNION ALL will help you efficiently consolidate and manage your data.
If you like this content and want to learn more, you might also read:
SQL Fundamentals for Data Analysis: A Hands-On Learning Series
SQL Basics 1: How to Create Table and Add Data in SSMS
SQL Basics 2: SELECT + FROM Statements
SQL Basics 4: GROUP BY and ORDER BY Statements
SQL Intermediate 1: INNER and OUTER Joins
SQL Intermediate 2: UNION Operator
SQL Intermediate 3: CASE Statement
SQL Intermediate 4: HAVING Clause
SQL Intermediate 5: UPDATE and DELETE Data
SQL Intermediate 6: AS - Aliasing
SQL Intermediate 7: PARTITION BY Clause
SQL Advanced 1: CTEs - Common Table Expressions
SQL Advanced 3: Stored Procedures
SQL Advanced 4: SUBQUERIES - In Select, From and Where
SQL Advanced 5: String Functions - TRIM, LTRIM, RTRIM, REPLACE, SUBSTRING, UPPER, and LOWER