



SQL:Target a SubQuery


People often ask me about subqueries, so I thought I would answer with an article touching on a bit of everything about them.

Before discussing SubQuery, there are a few things we should know, such as:

What is a Query?

I will start with its English definition. As a verb, it means “put a question or questions to”.

The technical definition with respect to SQL: “These are the commands issued to a database for retrieval of required information.” Looked at closely, the two definitions complement each other. For example:

select all data in the employee table.


select * from employee

Let’s add a condition, such as: select all employees with salary>5000


select * from employee where salary>5000
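
To try the two queries above without a database server, here is a minimal sketch using SQLite through Python's standard sqlite3 module. The table layout and the sample rows are assumptions made up for illustration; they are not from a real schema.

```python
import sqlite3

# In-memory database with a hypothetical employee table (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 4000), (2, "Ben", 6000), (3, "Chen", 9000)],
)

# Query without a condition: every row comes back.
all_rows = conn.execute("SELECT * FROM employee ORDER BY empid").fetchall()

# Query with a condition: only rows with salary > 5000 come back.
high_paid = conn.execute(
    "SELECT name, salary FROM employee WHERE salary > 5000 ORDER BY empid"
).fetchall()

print(len(all_rows))
print(high_paid)
```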

What is a SubQuery?

A SubQuery can be treated as a ‘query on query’. It is the inner query, which provides a targeted result to the outer (main) query. Let’s try a few examples to learn it.

Example:

select each employee's name along with the name of that employee's manager


select emp.name,(select mgr.name from employee AS mgr
where emp.mgrid=mgr.empid) from employee AS emp
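
A subquery in the select list like this one must return at most one value per outer row. The sketch below runs the same query on SQLite through Python's sqlite3 module, assuming a hypothetical employee table whose mgrid column points at the manager's empid; the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT, mgrid INTEGER)"
)
# Asha has no manager (mgrid is NULL); Ben reports to Asha, Chen reports to Ben.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", None), (2, "Ben", 1), (3, "Chen", 2)],
)

# The inner query looks up one manager name for each row of the outer query.
manager_rows = conn.execute(
    """
    SELECT emp.name,
           (SELECT mgr.name FROM employee AS mgr
            WHERE emp.mgrid = mgr.empid) AS manager
    FROM employee AS emp
    ORDER BY emp.empid
    """
).fetchall()

for name, manager in manager_rows:
    print(name, manager)
```

An employee without a manager still appears in the result, with NULL (None in Python) as the manager name, because the inner query finds no matching row.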

We mostly see subqueries in the where clause, for example: select employees whose salary equals the average salary


select * from employee where salary=(select AVG(salary) from employee)
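
Here the inner query does not depend on the outer query, so it can be computed once and its single value compared against every row. A small sketch on SQLite via Python's sqlite3 module, with made-up salaries chosen so that the average is exactly 6000:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
# Hypothetical data: the average of 4000, 6000 and 8000 is 6000.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 4000), (2, "Ben", 6000), (3, "Chen", 8000)],
)

# Employees whose salary equals the average salary.
at_average = conn.execute(
    "SELECT name, salary FROM employee "
    "WHERE salary = (SELECT AVG(salary) FROM employee)"
).fetchall()

# The same subquery with a different comparison operator.
above_average = conn.execute(
    "SELECT name, salary FROM employee "
    "WHERE salary > (SELECT AVG(salary) FROM employee)"
).fetchall()

print(at_average)
print(above_average)
```

With real data an exact match on the average is rare, so comparisons such as > or < are usually more useful than =.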

What is Correlated SubQuery?

A correlated sub-query is a sub-query that uses values from the outer query, typically in its WHERE clause. Let’s try an example:

select employees having salary greater than the average salary of their own department


select * from employee AS emp where emp.salary>(select AVG(dept.salary) from employee AS dept
where dept.department=emp.department)

The main difference is that a correlated subquery is logically evaluated once for each row of the outer query, because its result depends on that row; a non-correlated subquery can be evaluated just once.
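
What makes a subquery correlated is a reference to a column of the outer query inside the inner query. The sketch below, run on SQLite through Python's sqlite3 module, compares each employee's salary with the average of that employee's own department. The department column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "empid INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
# Hypothetical data: both departments have an average salary of 6000.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "IT", 4000),
        (2, "Ben", "IT", 8000),
        (3, "Chen", "HR", 5000),
        (4, "Dee", "HR", 7000),
        (5, "Eli", "HR", 6000),
    ],
)

# The inner query refers to emp.department, a column of the outer query,
# so its result can differ from one outer row to the next.
above_dept_avg = conn.execute(
    """
    SELECT emp.name, emp.department, emp.salary
    FROM employee AS emp
    WHERE emp.salary > (SELECT AVG(dept.salary)
                        FROM employee AS dept
                        WHERE dept.department = emp.department)
    ORDER BY emp.empid
    """
).fetchall()

print(above_dept_avg)
```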

Why do we require SubQuery or advantages of SubQuery?

  1. SubQuery holds its results like a temporary table which can be used by the outer query.
  2. SubQueries are often easier to understand, because each one can be read on its own.
  3. SubQuery breaks down a complex query into small and simple queries.
  4. SubQueries are easy to use as a replacement for joins. Whether performance differs depends on the database and the query.
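
The first point is easiest to see when the subquery sits in the FROM clause, where the outer query treats its result like a table. A minimal sketch on SQLite via Python's sqlite3 module; the alias dept_avg and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "empid INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
# Hypothetical data: IT averages 6000, HR averages 5000.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "IT", 4000), (2, "Ben", "IT", 8000), (3, "Chen", "HR", 5000)],
)

# The inner query builds one row per department; the outer query then
# filters that intermediate result as if it were an ordinary table.
well_paid_departments = conn.execute(
    """
    SELECT dept_avg.department, dept_avg.avg_salary
    FROM (SELECT department, AVG(salary) AS avg_salary
          FROM employee
          GROUP BY department) AS dept_avg
    WHERE dept_avg.avg_salary > 5500
    ORDER BY dept_avg.department
    """
).fetchall()

print(well_paid_departments)
```

Splitting the work this way also illustrates the third point: the grouping step and the filtering step can each be read and tested separately.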

SubQuery Rules

According to the SQL Server documentation referenced below, a subquery is subject to the following restrictions:

  • Up to 32 levels of nesting are possible, although the limit varies based on available memory and the complexity of other expressions in the query
  • If a table appears only in a subquery and not in the outer query, then columns from that table cannot be included in the output
  • The select list of a subquery introduced with a comparison operator can include only one expression or column name (except that EXISTS and IN operate on SELECT * or a list, respectively).
  • If the WHERE clause of an outer query includes a column name, it must be join-compatible with the column in the subquery select list.
  • The ntext, text, and image data types cannot be used in the select list of subqueries.
  • Because they must return a single value, subqueries introduced by an unmodified comparison operator (one not followed by the keyword ANY or ALL) cannot include GROUP BY and HAVING clauses.
  • The DISTINCT keyword cannot be used with subqueries that include GROUP BY.
  • The COMPUTE and INTO clauses cannot be specified.
  • ORDER BY can only be specified when TOP is also specified.
  • A view created by using a subquery cannot be updated.
  • The select list of a subquery introduced with EXISTS, by convention, has an asterisk (*) instead of a single column name. The rules for a subquery introduced with EXISTS are the same as those for a standard select list, because a subquery introduced with EXISTS creates an existence test and returns TRUE or FALSE, instead of data.
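
The last rule, that EXISTS tests for existence rather than returning data, can be checked with a small experiment. Note the assumption: the rules above are quoted for SQL Server, while this sketch runs on SQLite through Python's sqlite3 module, where several of the other restrictions do not apply in the same way. The table and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empid INTEGER PRIMARY KEY, name TEXT, mgrid INTEGER)"
)
# Hypothetical data: Ben and Chen both report to Asha.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", None), (2, "Ben", 1), (3, "Chen", 1)],
)

# Employees who manage at least one other employee.
template = """
    SELECT mgr.name
    FROM employee AS mgr
    WHERE EXISTS (SELECT {cols} FROM employee AS emp
                  WHERE emp.mgrid = mgr.empid)
    ORDER BY mgr.empid
"""

# The select list inside EXISTS does not change the outcome.
with_star = conn.execute(template.format(cols="*")).fetchall()
with_one = conn.execute(template.format(cols="1")).fetchall()

# In SQLite, EXISTS evaluates to 1 (true) or 0 (false).
no_match = conn.execute(
    "SELECT EXISTS (SELECT * FROM employee WHERE mgrid = 99)"
).fetchone()[0]

print(with_star, with_one, no_match)
```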

ref: http://msdn.microsoft.com/en-us/library/ms189543(v=sql.105).aspx

Join Vs SubQueries

I was looking for this answer, and although it is not a verified answer, it holds true in most cases. Refer to: http://stackoverflow.com/questions/2577174/join-vs-subquery

In most cases JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.

With JOINs, the RDBMS can create an execution plan that is better for your query, predict what data should be loaded for processing, and save time, unlike a sub-query, where it will run all the queries and load all their data to do the processing.

The good thing about sub-queries is that they are more readable than JOINs, which is why most people new to SQL prefer them; it is the easy way. But when it comes to performance, JOINs are better in most cases, even though they are not hard to read either.
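
Claims like these are best checked on your own database, because the result depends on the engine, its version, the indexes and the data. As a starting point, here is a sketch that writes the same question both ways and asks SQLite for its plan with EXPLAIN QUERY PLAN, using Python's sqlite3 module. The member and post tables are hypothetical, and the plan output is not shown because its format varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE member (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, member_id INTEGER);
    INSERT INTO member VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO post VALUES (1, 1), (2, 1), (3, 3);
    """
)

# Members who have written at least one post, as a subquery ...
subquery_sql = (
    "SELECT m.username FROM member AS m "
    "WHERE EXISTS (SELECT 1 FROM post AS p WHERE p.member_id = m.id) "
    "ORDER BY m.username"
)
# ... and as a join. DISTINCT is needed because the join yields one row
# per post, so a member with two posts would otherwise appear twice.
join_sql = (
    "SELECT DISTINCT m.username FROM member AS m "
    "JOIN post AS p ON p.member_id = m.id "
    "ORDER BY m.username"
)

subquery_rows = [row[0] for row in conn.execute(subquery_sql)]
join_rows = [row[0] for row in conn.execute(join_sql)]
print(subquery_rows, join_rows)

# Ask the engine how it intends to run each form.
plans = {}
for label, sql in (("subquery", subquery_sql), ("join", join_sql)):
    plans[label] = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    for step in plans[label]:
        print(label, step)
```

Both forms return the same members here. Which one runs faster on real tables is something to measure rather than assume.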

Conclusion

I am not building something new; I have assembled all the questions and answers related to subqueries that kept bugging me day and night. I hope this will be useful for people looking for answers in one place. Please mail me at admin@codespread.com






6 Responses to SQL:Target a SubQuery

  1. Rajeev says:

    Nice n informative…

  2. Rajeev says:

    Nice and informative…

  3. Rajeev says:

    Can you give some information on “Division operation” under Relational Algebra.

  4. > In most cases JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.
    I don’t know about MySQL, but for SQL Server that is false. Here is an example where a subquery is faster than JOINs. The objective of the query is to retrieve users who have made at least one comment.

    -- fastest
    SELECT [U].[Username] FROM [User] [U]
    WHERE EXISTS ( SELECT 1 FROM [Comment] [C] WHERE [C].[UserID] = [U].[ID] )

    vs

    -- normal
    SELECT [U].[Username] FROM [User] [U]
    INNER JOIN [Comment] [C] ON [U].[ID] = [C].[UserID]

    vs

    -- slowest (and dumb)
    SELECT [Username] FROM [User]
    WHERE [ID] IN ( SELECT [UserID] FROM [Comment] )

    • Admin says:

      Hi Michael, these recommendations come from experts, and their reference is mentioned in the article. Still, for the sake of learning, we can test these conditions with many tools. But before that, we can do an exercise: could you please share the order in which the components of a query execute? From the order itself, we can deduce which runs faster. Later, we can take the help of a tool.