Slice And Dice
Last time out, I showed you how to get your data into a database with the INSERT statement, and edit it with the UPDATE and DELETE statements. And this time, I’m going to show you how to extract specific “slices” of data from your database using a variety of SELECT statements.
Let’s get started!
Christmas Presents
Throughout this article, I’ll be using the database I developed last time to better illustrate the various examples. Just to refresh your memory, here’s what it looked like:
# members table
+-----------+-------+---------+---------+--------------------------+
| member_id | fname | lname | tel | email |
+-----------+-------+---------+---------+--------------------------+
| 1 | John | Doe | 1234567 | [email protected] |
| 2 | Jane | Doe | 8373728 | [email protected] |
| 3 | Steve | Klingon | 7449373 | [email protected] |
| 4 | Santa | Claus | 9999999 | [email protected] |
+-----------+-------+---------+---------+--------------------------+
# videos table
+----------+-------------------------------+------------------+
| video_id | title | director |
+----------+-------------------------------+------------------+
| 1 | Star Wars: The Phantom Menace | George Lucas |
| 2 | ET | Steven Spielberg |
| 3 | Charlie's Angels | McG |
| 4 | Any Given Sunday | Oliver Stone |
| 5 | Hollow Man | Paul Verhoeven |
| 6 | Woman On Top | Fina Torres |
+----------+-------------------------------+------------------+
# status table
+-----------+----------+
| member_id | video_id |
+-----------+----------+
| 2 | 6 |
| 4 | 2 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
+-----------+----------+
The simplest form of the SELECT query is the “catch-all” query, which returns all the records in a specific table. It looks like this:
mysql> SELECT * FROM members;
+-----------+-------+---------+---------+----------------------------+
| member_id | fname | lname | tel | email |
+-----------+-------+---------+---------+----------------------------+
| 1 | John | Doe | 1234567 | [email protected] |
| 2 | Jane | Doe | 8373728 | [email protected] |
| 3 | Steve | Klingon | 7449373 | [email protected] |
| 4 | Santa | Claus | 9999999 | [email protected] | |
+-----------+-------+---------+---------+----------------------------+
4 rows in set (0.00 sec)
The asterisk (*) indicates that you’d like to see all the columns present in the table. If, instead, you’d prefer to see one or two specific columns only in the result set, you can specify the column name(s) in the SELECT statement, like this:
mysql> SELECT lname FROM members;
+---------+
| lname |
+---------+
| Doe |
| Doe |
| Klingon |
| Claus |
+---------+
4 rows in set (0.00 sec)
You can eliminate duplicate entries using the DISTINCT keyword - the following query will not display members with the last name “Doe” more than once.
mysql> SELECT DISTINCT lname FROM members;
+---------+
| lname |
+---------+
| Doe |
| Klingon |
| Claus |
+---------+
3 rows in set (0.05 sec)
Where, Oh Where, Art Thou?
Of course, the whole idea of structuring data into rows and columns is to make it easier to get a focused result set. And a great part of that focus comes from the WHERE clause (maybe you remember this from the UPDATE and DELETE statements you learnt last time) to the SELECT statement, which allows you to define specific criteria for the result set. Records that do not meet the specified criteria will not appear in the result set.
For example, let’s suppose that you wanted to see a list of all members with the last name “Doe”.
mysql> SELECT * FROM members WHERE lname = "Doe";
+-----------+-------+-------+---------+--------------------+
| member_id | fname | lname | tel | email |
+-----------+-------+-------+---------+--------------------+
| 1 | John | Doe | 1234567 | [email protected] |
| 2 | Jane | Doe | 8373728 | [email protected] |
+-----------+-------+-------+---------+--------------------+
2 rows in set (0.00 sec)
Or let’s suppose that you wanted Santa Claus’ email address:
mysql> SELECT email FROM members WHERE fname = "Santa";
+----------------------------+
| email |
+----------------------------+
| [email protected] | |
+----------------------------+
1 row in set (0.06 sec)
Or even that you wanted to see a list of all movies by George Lucas.
mysql> SELECT title, director FROM videos WHERE director = "George Lucas";
+-------------------------------+--------------+
| title | director |
+-------------------------------+--------------+
| Star Wars: The Phantom Menace | George Lucas |
+-------------------------------+--------------+
1 row in set (0.06 sec)
Yes, I know the collection is incomplete. Maybe I should write to Santa for the other three.
Teacher’s Pet
You can use relational and Boolean operators to modify your SQL query further - this comes in very handy if your table contains a large amount of numeric data, as illustrated below:
# grades table
+-------+------+---------+------------+
| name | math | physics | literature |
+-------+------+---------+------------+
| john | 68 | 37 | 45 |
| jim | 96 | 89 | 92 |
| bill | 65 | 12 | 57 |
| harry | 69 | 25 | 82 |
+-------+------+---------+------------+
The six relational operators available to you in SQL are as follows:
Operator | What It Means |
---|---|
= | is equal to |
!= | is not equal to |
> | is greater than |
< | is less than |
>= | is greater than/equal to |
<= | is less than/equal to |
You can also use the Boolean operators AND, OR and NOT to create more complex queries.
Now, looking at the table above, if you wanted a list of all students who scored over 90 in their math paper, you could formulate a query which looked like this:
mysql> SELECT * FROM grades WHERE math > 90;
+------+------+---------+------------+
| name | math | physics | literature |
+------+------+---------+------------+
| jim | 96 | 89 | 92 |
+------+------+---------+------------+
1 row in set (0.00 sec)
Suppose you wanted to identify the smartest kid in class (you know this guy - he always sits in the front row, answers every question perfectly, and usually has wires on his teeth) so that you could beat him up during break.
mysql> SELECT name FROM grades WHERE math > 85 AND physics > 85 AND
literature > 85;
+------+
| name |
+------+
| jim |
+------+
1 row in set (0.00 sec)
What if you needed to identify the ones who flunked at least one paper?
mysql> SELECT * FROM grades WHERE math <= 25 OR physics <= 25 OR literature
<= 25;
+-------+------+---------+------------+
| name | math | physics | literature |
+-------+------+---------+------------+
| bill | 65 | 12 | 57 |
| harry | 69 | 25 | 82 |
+-------+------+---------+------------+
2 rows in set (0.00 sec)
And finally, you can also perform basic mathematical operations within your query - the next example demonstrates how the three grades can be added together to create a total grade.
mysql> SELECT name, math+physics+literature FROM grades;
+-------+-------------------------+
| name | math+physics+literature |
+-------+-------------------------+
| john | 150 |
| jim | 277 |
| bill | 134 |
| harry | 176 |
+-------+-------------------------+
4 rows in set (0.05 sec)
Obviously, such an operation should only be attempted on fields of the same type.
Reading Backwards
If you’d like to see the data from your table ordered by a specific field, SQL offers the ORDER BY construct. This construct allows you to specify both the column name and the direction in which you would like to see data (ascending or descending).
For example, if you’d like to see data from the “members” table above arranged by id, you could try this:
mysql> SELECT * FROM members ORDER BY member_id;
+-----------+-------+---------+---------+----------------------------+
| member_id | fname | lname | tel | email |
+-----------+-------+---------+---------+----------------------------+
| 1 | John | Doe | 1234567 | [email protected] |
| 2 | Jane | Doe | 8373728 | [email protected] |
| 3 | Steve | Klingon | 7449373 | [email protected] |
| 4 | Santa | Claus | 9999999 | [email protected] | |
+-----------+-------+---------+---------+----------------------------+
4 rows in set (0.06 sec)
And you could reverse the order with
mysql> SELECT * FROM members ORDER BY member_id DESC;
+-----------+-------+---------+---------+----------------------------+
| member_id | fname | lname | tel | email |
+-----------+-------+---------+---------+----------------------------+
| 4 | Santa | Claus | 9999999 | [email protected] | |
| 3 | Steve | Klingon | 7449373 | [email protected] |
| 2 | Jane | Doe | 8373728 | [email protected] |
| 1 | John | Doe | 1234567 | [email protected] |
+-----------+-------+---------+---------+----------------------------+
4 rows in set (0.00 sec)
You can limit the number of records in the result set with the LIMIT keyword - this keyword takes two parameters, which specify the row to start with and the number of rows to display. So the query
SELECT * FROM videos LIMIT 2,2;
would return rows 3 and 4 from the result set.
mysql> SELECT * FROM videos LIMIT 2,2;
+----------+------------------+--------------+
| video_id | title | director |
+----------+------------------+--------------+
| 3 | Charlie's Angels | McG |
| 4 | Any Given Sunday | Oliver Stone |
+----------+------------------+--------------+
2 rows in set (0.00 sec)
You can combine the ORDER BY and LIMIT constructs to quickly get the four newest records in the table - as the following example demonstrates:
mysql> SELECT * FROM videos ORDER BY video_id DESC LIMIT 0, 4;
+----------+------------------+----------------+
| video_id | title | director |
+----------+------------------+----------------+
| 6 | Woman On Top | Fina Torres |
| 5 | Hollow Man | Paul Verhoeven |
| 4 | Any Given Sunday | Oliver Stone |
| 3 | Charlie's Angels | McG |
+----------+------------------+----------------+
4 rows in set (0.00 sec)
Count() Me In
SQL also offers a bunch of built-in functions that come in handy when trying to obtain numeric totals and averages of specific fields. The first of these is the very useful COUNT() function, which counts the number of records in the result set and displays this total.
Consider the following example, which displays the total number of records in the “videos” table:
mysql> SELECT COUNT(*) FROM videos;
+----------+
| COUNT(*) |
+----------+
| 6 |
+----------+
1 row in set (0.00 sec)
This comes in very handy when you need to quickly calculate the total number of records in a table
The SUM() function calculates the sum of the values in the result set, while the AVG() function calculates the average. For example, if you wanted to calculate the average grade in math, physics and literature, you could use a query like this:
mysql> SELECT AVG(math), AVG(physics), AVG(literature) FROM grades;
+-----------+--------------+-----------------+
| AVG(math) | AVG(physics) | AVG(literature) |
+-----------+--------------+-----------------+
| 74.5000 | 40.7500 | 69.0000 |
+-----------+--------------+-----------------+
1 row in set (0.00 sec)
You can identify the smallest and largest value in a specific column with the MIN() and MAX() functions - the following queries display the lowest and highest grade in math respectively.
mysql> SELECT MIN(math) FROM grades;
+-----------+
| MIN(math) |
+-----------+
| 65 |
+-----------+
1 row in set (0.00 sec)
mysql> SELECT MAX(math) FROM grades;
+-----------+
| MAX(math) |
+-----------+
| 96 |
+-----------+
1 row in set (0.00 sec)
Like, You Know, Man…
SQL also offers the LIKE keyword, which is used to return results from a wildcard search and comes in very handy when you’re not sure what you’re looking for. There are two types of wildcards allowed in a LIKE construct: the % character, which is used to signify zero or more occurrences of a character, and the _ character, which is used to signify exactly one occurrence of a character.
Let’s suppose I wanted a list of all members whose first names contained the letter “e”. My query would look like this:
mysql> SELECT * FROM members WHERE fname LIKE '%e%';
+-----------+-------+---------+---------+----------------------+
| member_id | fname | lname | tel | email |
+-----------+-------+---------+---------+----------------------+
| 2 | Jane | Doe | 8373728 | [email protected] |
| 3 | Steve | Klingon | 7449373 | [email protected] |
+-----------+-------+---------+---------+----------------------+
2 rows in set (0.16 sec)
I could use LIKE to generate a list of members whose name begins with the letter “s”
mysql> SELECT * FROM members WHERE fname LIKE 's%';
+-----------+-------+---------+---------+----------------------------
| member_id | fname | lname | tel | email
+-----------+-------+---------+---------+----------------------------
| 3 | Steve | Klingon | 7449373 | [email protected]
| 4 | Santa | Claus | 9999999 | [email protected] |
+-----------+-------+---------+---------+----------------------------
2 rows in set (0.00 sec)
or search through my “videos” collection for movies containing the word segment “man” in their title.
mysql> SELECT title, director FROM videos WHERE title LIKE '%man%';
+--------------+----------------+
| title | director |
+--------------+----------------+
| Hollow Man | Paul Verhoeven |
| Woman On Top | Fina Torres |
+--------------+----------------+
2 rows in set (0.05 sec)
Joining Them Together
So far, all the queries you’ve seen have been concentrated on a single table. But SQL also allows you to query two or more tables at a time, and display a combined result set. This is technically referred to as a “join”, since it involves “joining” different tables at specific points to create new views of the data.
When using a join, it’s recommended that you prefix each column name with the name of the table it belongs to (I haven’t done this in any of the examples you’ve seen so far because all the columns have been localized to a single table.). For example, you would use “members.fname” to refer to the column named “fname” in the table “members”, and “status.video_id” to refer to the “video_id” column in the “status” table.
Here’s an example of a simple join:
mysql> SELECT * FROM status, members WHERE status.member_id =
members.member_id;
+-----------+----------+-----------+-------+-------+---------+-------------- --------------+
| member_id | video_id | member_id | fname | lname | tel | email |
+-----------+----------+-----------+-------+-------+---------+-------------- --------------+
| 1 | 1 | 1 | John | Doe | 1234567 | [email protected] |
| 1 | 2 | 1 | John | Doe | 1234567 | [email protected] |
| 1 | 3 | 1 | John | Doe | 1234567 | [email protected] |
| 2 | 6 | 2 | Jane | Doe | 8373728 | [email protected] |
| 4 | 2 | 4 | Santa | Claus | 9999999 | [email protected] | |
+-----------+----------+-----------+-------+-------+---------+-------------- --------------+
5 rows in set (0.00 sec)
In this case, the “status” and “members” tables have been joined together through the common column “member_id”.
You can specify the columns you’d like to see from the joined tables, as with any SELECT statement:
mysql> SELECT fname, lname, video_id FROM members, status WHERE
members.member_id = status.member_id;
+-------+-------+----------+
| fname | lname | video_id |
+-------+-------+----------+
| Jane | Doe | 6 |
| Santa | Claus | 2 |
| John | Doe | 1 |
| John | Doe | 2 |
| John | Doe | 3 |
+-------+-------+----------+
5 rows in set (0.16 sec)
You can also join three tables together - the following example uses the “status” table, combined with member information and video details, to create a composite table which displays which members have which videos.
mysql> SELECT fname, lname, title FROM members, videos, status WHERE
status.member_id = members.member_id AND status.video_id = videos.video_id;
+-------+-------+-------------------------------+
| fname | lname | title |
+-------+-------+-------------------------------+
| Jane | Doe | Woman On Top |
| Santa | Claus | ET |
| John | Doe | Star Wars: The Phantom Menace |
| John | Doe | ET |
| John | Doe | Charlie's Angels |
+-------+-------+-------------------------------+
5 rows in set (0.17 sec)
Incidentally, if the thought of writing long table names over and over again doesn’t appeal to you, you can assign simple aliases to each table and use these instead. The following example assigns the aliases “m”, “s” and “v” to the “members”, “status” and “videos” tables respectively.
mysql> SELECT m.fname, m.lname, v.title FROM members m, status s, videos v
WHERE s.member_id = m.member_id AND s.video_id = v.video_id;
+-------+-------+-------------------------------+
| fname | lname | title |
+-------+-------+-------------------------------+
| Jane | Doe | Woman On Top |
| Santa | Claus | ET |
| John | Doe | Star Wars: The Phantom Menace |
| John | Doe | ET |
| John | Doe | Charlie's Angels |
+-------+-------+-------------------------------+
5 rows in set (0.00 sec)
Nest Egg
SQL also allows you to nest one query within another, such that the result of the inner query provides data for the outer query. Such a query is referred to as a sub-query, and it allows a great deal of flexibility when formulating long and complex queries.
Let’s suppose you want to find out who’s rented “The Phantom Menace” this weekend. You could of course perform a join, as described on the previous page, and get the answer to your question. Or you could formulate a subquery, which would look like this:
mysql> select fname, lname from members where member_id=(select member_id
from status where video_id=1);
In this case, SQL will first execute the inner query
SELECT member_id FROM status WHERE video_id=1;
+-----------+
| member_id |
+-----------+
| 1 |
+-----------+
and then assign the return value to the outer query, which will display the result.
SELECT fname, lname FROM members WHERE member_id=1;
+-------+-------+
| fname | lname |
+-------+-------+
| John | Doe |
+-------+-------+
1 row in set (0.00 sec)
There is a limit on the number of subqueries you can use in a single SQL statement, but it’s usually quite a comfortable number. Note, however, that mySQL does not currently support SQL subqueries.
And that’s about it. I hope this introduction to SQL helped you get some idea of how to go about creating and using a database, and that you now have a better understanding of the language. Till next time - stay healthy!
Note: All examples in this article have been tested on mySQL 3.22. Examples are illustrative only, and are not meant for a production environment. YMMV!
This article was first published on 17 Jan 2001.