MySQL: Subqueries Part 1

I’m reading about subqueries and thought I’d try a simple one. It doesn’t make sense to actually do it this way, but it is a good one to start with: find all of the customers whose state starts with a ‘C’.


SELECT * FROM `customers` 
WHERE state IN (SELECT state FROM `customers` WHERE state LIKE 'C%')

This is actually a very inefficient query, taking 24.1081 seconds. The normal way to do this query,

SELECT * FROM `customers`
WHERE state LIKE 'C%'

takes 0.0008 seconds. So the first lesson I learned is to think about whether a subquery is the best way to do what you want. Even though this is a ‘simple’ query, the subquery runs for each row in the outer query, and each run of the subquery scans every row in the table. So for a table with n rows, the amount of work grows as n².
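Here is a rough sketch for trying the comparison yourself. The `customers_demo` table, its columns, and the sample rows are made up for illustration, since the real table isn't shown in this post. EXPLAIN asks MySQL how it plans to run each query; what it reports depends on your MySQL version, and newer versions may handle the IN subquery much better than what I saw.

```sql
-- Hypothetical stand-in for the customers table; the columns are assumptions.
CREATE TABLE customers_demo (
    id    INT PRIMARY KEY,
    name  VARCHAR(50),
    state CHAR(2)
);

INSERT INTO customers_demo (id, name, state) VALUES
    (1, 'Ann',  'CA'),
    (2, 'Bob',  'CO'),
    (3, 'Cara', 'TX'),
    (4, 'Dev',  'CT'),
    (5, 'Elle', 'OH');

-- Subquery form: the inner SELECT produces the list of states to match against.
EXPLAIN
SELECT * FROM customers_demo
WHERE state IN (SELECT state FROM customers_demo WHERE state LIKE 'C%');

-- Direct form: a single filter on the table, no subquery.
EXPLAIN
SELECT * FROM customers_demo
WHERE state LIKE 'C%';
```

Dropping the EXPLAIN keyword runs the queries themselves. Both forms return the same three customers (CA, CO, and CT), so the only thing that differs is how much work the server does to get there.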

Here’s one that makes more sense. Each app belongs to a category. The category table is the same one I used in other posts. Not all categories are being used in apps. But each app has a category. I’d like to know how many apps are in each category.


SELECT category.name, categorycount.NumberOfApps 
FROM category 
INNER JOIN
(SELECT category_id, COUNT(*) AS NumberOfApps FROM apps GROUP BY category_id) categorycount
ON categorycount.category_id =  category.id
ORDER BY LOWER(category.name)

This yields the following results in phpMyAdmin:
[Screenshot: Category Count]

A couple of interesting things about this query. First, note that the subquery is labeled with an alias, categorycount; MySQL requires an alias for any subquery used in the FROM clause. All of the column names are also preceded by a table name or alias. If you run the subquery by itself, you get a table that has eight rows in it. The inner join just matches the names of the categories with the category ids from the apps table. Since the subquery can be run independently, it is what is called a simple (non-correlated) subquery. The ORDER BY clause uses LOWER to put the categories in strict alphabetical order, because one of the categories is lower case and it would appear at the end of the table without LOWER.
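To make the "can be run independently" point concrete, here is a small sketch. The `category_demo` and `apps_demo` tables and their rows are hypothetical, made up only so the example stands on its own. The first query is the subquery by itself. The second is a correlated subquery for contrast: it refers to a column from the outer query, so it cannot be run alone.

```sql
-- Hypothetical sample data; the real category and apps tables are not shown in this post.
CREATE TABLE category_demo (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE apps_demo (id INT PRIMARY KEY, title VARCHAR(50), category_id INT);

INSERT INTO category_demo (id, name) VALUES
    (1, 'Games'), (2, 'utilities'), (3, 'Music');   -- 'Music' has no apps
INSERT INTO apps_demo (id, title, category_id) VALUES
    (1, 'Chess', 1), (2, 'Solitaire', 1), (3, 'Calculator', 2);

-- The simple subquery on its own: one row per category_id that appears in apps_demo.
SELECT category_id, COUNT(*) AS NumberOfApps
FROM apps_demo
GROUP BY category_id;

-- A correlated subquery for contrast: it uses category_demo.id from the
-- outer query, so it only makes sense inside that query.
SELECT category_demo.name,
       (SELECT COUNT(*)
        FROM apps_demo
        WHERE apps_demo.category_id = category_demo.id) AS NumberOfApps
FROM category_demo
ORDER BY LOWER(category_demo.name);
```

One difference worth noticing: the correlated version lists every category, including 'Music' with a count of 0, while the INNER JOIN version only shows categories that have at least one app.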

In a previous post I used the favorite words database to do an INNER JOIN. Here I use it to count the number of words in each category. Since the tables are set up similarly to the apps tables, I can make just a slight modification to the query and it works.


SELECT words_categories.name, categorycount.NumberOfWords
FROM words_categories 
INNER JOIN
(SELECT category_id, COUNT(*) AS NumberOfWords FROM words GROUP BY category_id) categorycount
ON categorycount.category_id =  words_categories.id
ORDER BY LOWER(words_categories.name)

[Screenshot: Word Count]

In this case, the subquery makes it pretty clear what you are trying to do. But you can also write this as an INNER JOIN without the subquery.


SELECT words_categories.name, COUNT(*) AS NumberOfWords
FROM words_categories 
INNER JOIN
words
ON words.category_id =  words_categories.id
GROUP BY words_categories.name
ORDER BY LOWER(words_categories.name)

The subquery version is actually a bit faster, 0.0002 seconds vs 0.0003 seconds without it. This is a small table with around 1200 entries, so the difference doesn’t matter in practice.
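If you want to convince yourself that the two forms really are interchangeable, a sketch like the following runs both against the same sample data. The `words_categories_demo` and `words_demo` tables, the aliases `c` and `w`, and the rows are all hypothetical, since the real tables are not shown in this post.

```sql
-- Hypothetical sample data; the real words tables are not shown in this post.
CREATE TABLE words_categories_demo (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE words_demo (id INT PRIMARY KEY, word VARCHAR(50), category_id INT);

INSERT INTO words_categories_demo (id, name) VALUES
    (1, 'Nature'), (2, 'food');
INSERT INTO words_demo (id, word, category_id) VALUES
    (1, 'meadow', 1), (2, 'river', 1), (3, 'biscuit', 2);

-- Form 1: count inside a derived table, then join to pick up the names.
SELECT c.name, categorycount.NumberOfWords
FROM words_categories_demo c
INNER JOIN
    (SELECT category_id, COUNT(*) AS NumberOfWords
     FROM words_demo
     GROUP BY category_id) categorycount
ON categorycount.category_id = c.id
ORDER BY LOWER(c.name);

-- Form 2: join first, then group and count.
SELECT c.name, COUNT(*) AS NumberOfWords
FROM words_categories_demo c
INNER JOIN words_demo w ON w.category_id = c.id
GROUP BY c.name
ORDER BY LOWER(c.name);
```

Both queries return 'food' with 1 word and 'Nature' with 2. With timings this small the numbers can shift from one run to the next, so putting EXPLAIN in front of each query is probably a more reliable way to compare them than the clock.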
