What is the Difference between a Join and Subquery?

Joins and subqueries are both be used to query data from different tables and may even share the same query plan, but there are many differences between them.  Knowing the differences and when to use either a join or subquery to search data from one or more tables is key to mastering SQL. All the ex
Joins and subqueries are both be used to query data from different tables and may even share the same query plan, but there are many differences between them.  Knowing the differences and when to use either a join or subquery to search data from one or more tables is key to mastering SQL.
All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database.  You can get started using these free tools using my Guide Getting Started Using SQL Server.

Joins versus Subqueries

Joins and subqueries are both used to combine data from different tables into a single result.  They share many similarities and differences.
Subqueries can be used to return either a scalar (single) value or a row set; whereas, joins are used to return rows.
A common use for a subquery may be to calculate a summary value for use in a query.  For instance we can use a subquery to help us obtain all products have a greater than average product price.
SELECT ProductID,
       Name,
       ListPrice,
       <span style="color: #008000;">(SELECT AVG(ListPrice)</span>
<span style="color: #008000;">          FROM Production.Product)</span> AS AvgListPrice
  FROM Production.Product
 WHERE ListPrice > <span style="color: #0000ff;">(SELECT AVG(ListPrice)</span>
<span style="color: #0000ff;">  FROM Production.Product)</span>

There are two subqueries in this SELECT statement.  The first’s purpose is to display the average list price of all products, the second’s purpose is for filtering out products less than or equal to the average list price.

Here the subquery is returning a single value which is then used filter out products.
Notice how the subqueries are queries unto themselves.  In this example you could paste the subquery, without the parenthesis, into a query window and run it.
Contrast this with a join whose main purpose of a join is to combine rows from one or more tables based on a match condition.  For example we can use a join display product names and models.
Select Product.Name,
       ProductModel.Name as ModelName
FROM   Production.product
       <span style="color: #ff0000;">INNER JOIN Production.ProductModel</span>
<span style="color: #ff0000;">       ON Product.ProductModelID = ProductModel.ProductModelID</span>
In this statement we’re using an INNER JOIN to match rows from both the Product and ProductModel tables.  Notice that the column ProducModel.Name is available for use throughout the query.
The combined row set is then available by the select statement for use to display, filter, or group by the columns.
This is different than the subquery.  There the subquery returns a result, which is immediately used.
Note that he join is an integral part of the select statement.  It can not stand on its own as a subquery can.
A subquery is used to run a separate query from within the main query.  In many cases the returned value is displayed as a column or used in a filter condition such as where or having clause.  When a subquery incorporates a column from the main query it is said to be correlated.  In this way a sub query is somewhat like a join in that values from two or more tables can be compared.
My article Introduction to Subqueries in the SELECT Statement provides a good explanation of correlated subqueries.
Joins are used in the FROM clause of the WHERE statement; however, you’ll find subqueries used in most clauses such as the:
  • SELECT List – here a subqueries used to return single values are used.
  • WHERE clause– depending on the conditional operator you’ll see single value or row based subqueries.
  • FROM clause– It is typical to see row based result subqueries used here.
  • HAVING clause – In my experience scalar (single value) subqueries are used here.
Though joins and subqueries have many differences, they can be used to solve similar problems.  In fact just because you write a SQL statement as a subquery doesn’t mean the DBMS executes as such.
Let’s look at an example.
Suppose the Sales Manager for Adventure Works wants a detailed listing of all sales orders and the number of order details lines for each order.
Surprisingly there are two ways to go about solving this.  We can use a join or a subquery.
Here are the two statements side by side:
join and subquery compared
Side-by-Side Comparison of Join and Subquery
Obviously they look different, but did you know they have very similar query plans?
Here is the query plan for a subquery
query plan for a subquery
Subquery Query Plan
If you look closely you’ll see there is a Merge Join operation.  The subquery is being translated into the same set of operation used for the join.  In fact, if you look at the corresponding joins query plan, you’ll see it is very similar.  You can get more detail about his in my article what is a query plan.
Subqueries and joins can be confusing, but they don’t have to be that way.  I have put together a really great series of videos explaining subqueries and their mysteries.

No comments:

Post a Comment

Genuine websites to earn money.

If you are interested in PTC sites then this article is for you. I have personally tried many of the sites and found that the best thing ...