Archives for May 2009

May 5, 2009 - 10 comments

Optimizing MySQL Queries with JOINs

Posted by Jason Lengstorf in Development

DISCLAIMER: I am by no means a MySQL optimization expert, and I can't guarantee that the information provided in this blog entry will take a query from painfully slow to blazing fast. I can, however, guarantee that this entry was written with the best intentions and aims to explore some of the more advanced features of MySQL.

This week, I wanted to share some of my recent experiments with MySQL queries in regard to lowering the number of requests sent to the database server and, hopefully, avoiding the bottleneck that can form on sites that rely heavily on database interaction.

What Are MySQL Joins?

A JOIN in MySQL is a tool that allows developers to combine one or more tables in a query. There are a few flavors of JOIN, but we'll be focusing on LEFT JOIN today.

A LEFT JOIN in MySQL combines rows from two tables. We're able to narrow our searches as usual using a WHERE clause, and we can specify how the tables are joined with a USING clause.

A Simple Example

It's tough to explain JOIN in plain English without getting lost in a bunch of technical jargon (as I'm sure anyone who ventures into the MySQL manual is already aware). It's far easier to provide a concrete example to illustrate the way that LEFT JOIN works.

We'll be using three tables in this example: users, lists, and items. These three tables will be the basis of a theoretical wish list application, with a user's unique ID and name stored in the users table, unique list ID and user ID stored in lists, and individual item IDs, containing list ID, and item text in the items table.

Table: users
+---------+-----------------+
| user_id | user_name       |
+---------+-----------------+
| 1       | Jason Lengstorf |
+---------+-----------------+
| 2       | John Doe        |
+---------+-----------------+

Table: lists
+---------+---------+
| list_id | user_id |
+---------+---------+
| 1       | 2       |
+---------+---------+
| 2       | 1       |
+---------+---------+

Table: items
+---------+---------+----------------------+
| item_id | list_id | item_text            |
+---------+---------+----------------------+
| 1       | 1       | iPod Touch           |
+---------+---------+----------------------+
| 2       | 1       | Chuck Season 2       |
+---------+---------+----------------------+
| 3       | 1       | Nike Free Shoes      |
+---------+---------+----------------------+
| 4       | 2       | F-Stop Maverick Bag  |
+---------+---------+----------------------+
| 5       | 2       | Bon Iver Tickets     |
+---------+---------+----------------------+
| 6       | 2       | Rogue Mocha Porter   |
+---------+---------+----------------------+

Imagine we've been supplied with a user's ID and need to retrieve the items on his wish list. Without using JOIN, we would need two MySQL queries: one to retrieve the list ID associated with the user, and one to retrieve all the items associated with that list ID.

Example 1 (Not Using JOIN)

<?php
	// Establish a MySQL connection
	$link = new PDO(DBINFO, DBUSER, DBPASS);

	// Retrieve the list ID associated with the user ID
	$sql = "SELECT list_id
			FROM lists
			WHERE user_id = ?
			LIMIT 1";
	$stmt = $link->prepare($sql); // Prepare the statement
	if($stmt->execute(array($_POST['user_id'])))
	{
		$row = $stmt->fetch(); // Load the results in an array
		$list_id = $row['list_id'];
		$stmt->closeCursor(); // Free memory used in this query

		// Retrieve the items associated with the list ID
		$sql = "SELECT item_text
				FROM items
				WHERE list_id = $list_id";
		foreach($link->query($sql) as $row) {
			// Output the wish list items
			echo $row['item_text'], "<br />";
		}
	}
?>

Assuming the user has supplied the user id of 1, we would see the following output:

F-Stop Maverick Bag
Bon Iver Tickets
Rogue Mocha Porter

This returns the desired result, but sending two queries just feels inefficient. Fortunately, we can shorten this script to just one query using a JOIN.

Example 1 (Using JOIN)

<?php
	// Establish a MySQL connection
	$link = new PDO(DBINFO, DBUSER, DBPASS);

	// Retrieve the items associated with the user ID
	$sql = "SELECT item_text
			FROM lists
			LEFT JOIN items
			USING (list_id)
			WHERE user_id = ?";
	$stmt = $link->prepare($sql); // Prepare the statement
	if($stmt->execute(array($_POST['user_id'])))
	{

		// Loop through the returned results
		while($row = $stmt->fetch()) {
			// Output the wish list items
			echo $row['item_text'], "<br />";
		}

		$stmt->closeCursor(); // Free memory used in this query
	}
?>

Using a LEFT JOIN, we were able to combine the lists and items tables, grouping entries by their list IDs. Essentially, by using a JOIN, we're temporarily creating a new table that looks like this:

SELECT * FROM lists LEFT JOIN items USING (list_id)
+---------+---------+---------+---------------------+
| list_id | user_id | item_id | item_text           |
+---------+---------+---------+---------------------+
| 1       | 2       | 1       | iPod Touch          |
+---------+---------+---------+---------------------+
| 1       | 2       | 2       | Chuck Season 2      |
+---------+---------+---------+---------------------+
| 1       | 2       | 3       | Nike Free Shoes     |
+---------+---------+---------+---------------------+
| 2       | 1       | 4       | F-Stop Maverick Bag |
+---------+---------+---------+---------------------+
| 2       | 1       | 5       | Bon Iver Tickets    |
+---------+---------+---------+---------------------+
| 2       | 1       | 6       | Rogue Mocha Porter  |
+---------+---------+---------+---------------------+

With all the information available from both tables, we're able to use our user's ID to directly select wish list items.

Going a Step Further

Our wish list isn't very useful if it only shows a list of products. In order to make any sense, we should probably show the user's name to indicate who it is that wants the displayed items.

This presents a problem for us, however, because we now need to pull information from all three tables in order to gather all the necessary data.

Example 2 — Combining Three Tables Using LEFT JOIN

<?php
	// Establish a MySQL connection
	$link = new PDO(DBINFO, DBUSER, DBPASS);

	// Retrieve the user's name and associated list items
	$sql = "SELECT user_name, item_text
			FROM users
			LEFT JOIN lists
			USING (user_id)
			LEFT JOIN items
			USING (list_id)
			WHERE user_id = ?";
	$stmt = $link->prepare($sql); // Prepare the statement
	if($stmt->execute(array($_POST['user_id'])))
	{

		// Loop through the returned results
		while($row = $stmt->fetch()) {

			// Save the user name
			$user_name = $row['user_name'];

			// Create an array of items
			$items[] = $row['item_text'];
		}

		$stmt->closeCursor(); // Free memory used in this query

		// Output the user's name and identify what we're displaying
		echo "$user_name's Wish List<br /><br />n";

		// Loop through the items and output to the browser
		foreach($items as $item) {
			echo $item, "<br />";
		}
	}
?>

With the above query, we effectively retrieve the user's name from the users table, then pull the list ID that matches the user's ID, and finally retrieve all items that match the list ID.

The above script will output the following, assuming user ID 1 was passed:

Jason Lengstorf's Wish List

F-Stop Maverick Bag
Bon Iver Tickets
Rogue Mocha Porter

By combining the queries using LEFT JOIN, we create a much easier to read snippet of code. Also, by reducing the number of queries run, we can hope to see faster execution in our scripts.

Summary

Like I said, I'm not a MySQL optimization expert, but from what I've read and my limited testing, it seems pretty safe to say that using JOIN to reduce the number of queries in our scripts is a good way to not only enhance the readability of our scripts, but also to grease them up for faster execution.

Do you know any tricks for speeding up MySQL queries? How do you feel about using JOIN in queries? Let me know in the comments!