mySQL: INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN

As mentioned earlier, I’ve been brushing up on my JOIN terminology using data in my mySQL databases. The results here probably apply to other SQL DBMSs as well.

Differences

These three joins result in almost the same rows with minor—crucial—differences. The INNER JOIN returns all of the rows in the first table that have a match in the second table. The order in which the tables is specified doesn’t matter. If there is a row in either table that doesn’t have a match (specified by the join condition) it doesn’t appear in the join.

The LEFT OUTER JOIN returns all of the rows in the first table regardless of whether there is a matching the second table. If there is no match in the second table, the values for those fields are NULL. The RIGHT OUTER JOIN returns all of the rows in the second table regardless of whether there is a matching the first table. If there is no match in the first table, the values for those fields are NULL. As you can see by the specification, the order in which the tables are specified for the outer joins matters.

Examples

In my database the `apps` table has 30 rows. The `category` table has 10 rows. Two of the apps do not have a category assigned to them. Two of the categories have no apps associated with them.

In my database, this code returns 28 rows—the two apps with no assigned category do not appear.


SELECT apps.name, app_category.name 
FROM apps
INNER JOIN app_category 
ON category_id = app_category.id

This code returns 30 rows—the two apps with no assigned category appear in the result and the result and the app_category.name is NULL.


SELECT apps.name, app_category.name 
FROM apps
LEFT OUTER JOIN app_category 
ON category_id = app_category.id

This is a bit more complicated. This code returns 30 rows—the two apps with no assigned category do not appear in the result. The two categories that have no matching apps have an app.name of NULL.


SELECT apps.name, app_category.name 
FROM apps
RIGHT OUTER JOIN app_category 
ON category_id = app_category.id

Inner Joins

I’m re-reading the Visual QuickStart Guide to SQL and working out some examples. I don’t use a whole lot of joins on my websites and I’ve never given the topic much thought. In fact, if you asked me, I would have said that I never use joins, because none of my code uses the word.

It turns out that a lot of the queries on my websites use INNER JOINs, but without using the INNER JOIN keyword. In this example I have a table of products and a table of categories. Each product belongs to one category and the category key is in the product table. In this case the category table has unique values for for `name`.

I frequently use something like this:


SELECT product.name, product_category.name 
FROM product, product_category
WHERE product.category_id = product_category.id

Which is equivalent to this using JOIN syntax.

SELECT product.name, category.name 
FROM product
INNER JOIN product_category AS category
ON product.category_id = category.id

Here’s an example where I combine three tables. The product table has information about the product. It has a unique id for each product. The details table has selling information about the product. This table has a one-to-one mapping to the product table using `product_id`. The category table is the same table as in the example above. It is a one-to-many mapping. Note that the order in which the joins are made doesn’t matter.


SELECT product.name, details.itemNum, details.price, category.name, details.taxable, details.shipping_charge
FROM product, product_details AS details, product_category AS category
WHERE product.id = details.product_id
AND product.category_id = category.id
AND  details.isForSale = 'true'
ORDER BY product.name

Which is equivalent to this using JOIN syntax.

SELECT product.name, details.itemNum, details.price, category.name, details.taxable, details.shipping_charge
FROM product
INNER JOIN product_details AS details
ON product.id = product_id
INNER JOIN product_category AS category
ON product.category_id = category.id
WHERE details.isForSale = 'true'
ORDER BY product.name

Backticks in mySQL

When using phpMyAdmin to work on mySQL databases, you may have noticed that database and table names, as well as fields, are surrounded by backticks. If you do a query, the result may look something like this:


SELECT *  FROM `product` WHERE `name` LIKE 'Articulation'

But you can take them out and it still works.


SELECT *  FROM product WHERE name LIKE 'Articulation'

There is a discussion of this at StackOverflow and the consensus is that it is a non-standard practice that could cause problems if you migrate to another DBMS, but it has advantages if you are sticking with mySQL.

If you use backticks in your code, you can use reserved keywords for table names. You can also use table names with spaces.

“It’s a lot easier to search your code-base for something in backticks. Say you have a table named event. grep -r “event” * might return hundreds of results. grep -r “\`event\`” * will return anything probably referencing your database.” (EllisGL)

If you use them consistently, it does make the code a bit easier to read, but harder to write. Note that if you need more than the table name, you put backticks around the database as well as the table. e.g. `myDatabase`.`myTable` NOT `myDatabase.myTable`

If you are doing a bunch of things in a query it can get a bit messy. Here’s an example with and without.


SELECT `product`.`name`, `category`.`name` 
FROM `product`
INNER JOIN product_category AS category
ON `product`.`category_id` = `category`.`id`

But you can take them out and it still works.


SELECT product.name, category.name 
FROM product
INNER JOIN product_category AS category
ON product.category_id = category.id

You can even take out just some of them and it works fine, however it is probably not good coding practice.


SELECT product.name, `category`.name
FROM `product`
INNER JOIN product_category AS category
ON `product`.`category_id` = `category`.`id`

Note that you still need to use quotes for search terms (like ‘Articulation’ above) and when inserting into non-numeric fields.

I haven’t given this topic any thought before now, but looking at my PHP code, I noticed that I never use backticks in my queries.

Combine two tables to create a mailing list.

I have two MySQL databases on my server for my customers. One has customers who work for universities and one has customers who purchased from our website or made inquiries by phone or email. There is some overlap between the two and for the latest product announcement, I wanted to avoid sending duplicate emails. I usually use a php script to do my queries. If I am sending to just one group, the query is fairly straightforward.



require_once('marketing_database_my.inc');

$fname = "./MarketingExports/email.txt";

    $sql  = "(SELECT name, email FROM Universities ";
    $sql .= "WHERE email IS NOT NULL AND email != '' ";
    $sql .= "AND duplicate_email = 0 ";
    $sql .= "AND remove_email != 1) ";
    
    $result = $dbMKT->query($sql);
    $numRecords = mysqli_num_rows($result);

    echo "There are $numRecords records.";
    // Open the file to save the words
        
    $fp = fopen($fname, 'w');
    if (!$fp) {
        echo "Couldn’t open file.";

    } else {

        echo "Save to file $fname.<br />";
        for ($i = 0; $i < $numRecords; $i++) {
            $row = $result->fetch_array();
            $id = $row["id"];
            $company = trim($row["company"]);
            $name = trim($row["name"]);
            $email = trim($row["email"]);
         
            // Add to the email mail list
            $info = "\"$name\" <$email>";
            fwrite ($fp, "$info" . "\r");
        }
    }
    fclose($fp);

However, if I want to combine the University list and the other customers list, I need to adjust the query a bit by using a UNION


  $sql  = "(SELECT name, email, company FROM Customers ";
  $sql .= "WHERE email IS NOT NULL AND email != '' ";
  $sql .= "AND duplicate_email = 0 ";
  $sql .= "AND remove_email != 1) ";
  $sql .= "UNION ";
  $sql .= "(SELECT Universities.name, Universities.email, Universities.title FROM Universities, Customers ";
  $sql .= "WHERE Universities.email IS NOT NULL AND Universities.email != '' ";
  $sql .= "AND Universities.duplicate_email = 0 ";
  $sql .= "AND Universities.remove_email != 1 ";
  $sql .= "AND Universities.email != Customers.email) ";
  $sql .= "ORDER BY email ";

Two things to note. Name and email are the same in both tables. However, some of my customer records have the customer name as the company and a blank name field. The UNION won’t work if there are a different number of fields in the SELECT statement, so I include the title field for Universities just to make MySQL happy.

I can order by email because it is in both tables. Notice also that I had to tell the second part of the query which table to work on for the WHERE statement.

An example of using Enums

After I posted my deviceType code on Stackoverflow, one of the commenters suggested that I use enums. Now I know what they are because Apple uses them all the time but I didn’t know how useful they were. I did some research on enums and they are nice. They make the code a bit more readable, but mostly they allow the compiler to help you type and catch errors. Xcode will autocomplete your deviceType for you and will give you the error: Use of undeclared identifier if you try to use a value that isn’t defined. I suspect that they are slightly faster in comparisons since they use integers rather than strings. Also, you can use them directly in switch statements. Here’s the code rewritten as an enum. I prefixed the values with LF but you should use whatever is appropriate for your project.

This is in my header file


// Devices as of Fall 2014
typedef NS_ENUM(NSInteger, LFdeviceType) {
    LFDeviceTypePhoneClassic,
    LFDeviceTypePhoneRetina3_5,
    LFDeviceTypePhoneRetina4,
    LFDeviceTypePhone6,
    LFDeviceTypePhone6Plus,
    LFDeviceTypePadClassic,
    LFDeviceTypePadRetina,
};

And this is in my .m file.


+ (NSInteger)deviceType {
    CGSize screenSize = [[UIScreen mainScreen] bounds].size;
    CGFloat deviceScale = [UIScreen mainScreen].scale;
    LFdeviceType device = LFDeviceTypePhoneClassic;

    if (UI_USER_INTERFACE_IDIOM() == UIUserInterfaceIdiomPhone) {
        device = LFDeviceTypePhoneClassic; // Just in case it doesn't make it through the conditionals
        // Classic has a resolution of 480 × 320
        if( (screenSize.height == 480 || screenSize.width == 480) && deviceScale == 1.0f ) {
            device = LFDeviceTypePhoneClassic;
        // Retina has a resolution of 960 × 640
        } else if( (screenSize.height == 480 || screenSize.width == 480) && deviceScale == 2.0f ) {
            device = LFDeviceTypePhoneRetina3_5;
        // Retina 4" has a resolution of 1136 x 640
        } else if (screenSize.height == 568 || screenSize.width == 568 ) {
            device = LFDeviceTypePhoneRetina4;
        // iPhone 6 has a resolution of 1334 by 750
        } else if (screenSize.height == 667 || screenSize.width == 667 ) {
            device = LFDeviceTypePhone6;
        // iPhone 6 Plus has an actual size of 2208 × 1242 and resolution of 1920 by 1080
        // Reported size is 736 x 414
        } else if (screenSize.height == 736 || screenSize.width == 736 ) {
            device = LFDeviceTypePhone6Plus;
        }

    } else if (UI_USER_INTERFACE_IDIOM() == UIUserInterfaceIdiomPad) {
        device = LFDeviceTypePadClassic; // Just in case it doesn't make it through the conditionals
        if(deviceScale == 1.0f) {
            device = LFDeviceTypePadClassic;
        } else if (deviceScale == 2.0f) {
            device = LFDeviceTypePadRetina;
        }
    }
    //NSLog(@"The device is %@ scale is %f and the height is %f and width is %f", device, deviceScale, screenSize.height, screenSize.width);

    return device;
}

Call it like this:

if ( (   [Utilities deviceType] == LFDeviceTypePhoneClassic 
      || [Utilities deviceType] == LFDeviceTypePhoneRetina3_5) &&
        numberOfFoilsOnScreen > 7 ) {
        numberOfFoilsOnScreen = 7;
}