Jump to content
Larry Ullman's Book Forums

Q About Multiple Inner Joins


Recommended Posts

I will admit that one of my weaker points is comprehending SQL queries, especially ones with multiple inner joins (as is the case with a lot of the queries for the second example site in the book!). With that said, I'm trying to get better, but I want to confirm some things.

 

Let's use the following query as an example (please refer to the book for details on the database structure, etc.). This example comes from page 196:

 

SELECT gc.description, gc.image, CONCAT("C",sc.id) AS sku,
CONCAT_WS(" - ",s.size,sc.caf_decaf,sc.ground_whole,sc.price) AS name,
sc.stock
FROM specific_coffees AS sc INNER JOIN sizes AS s ON s.id=sc.size_id
INNER JOIN general_coffees AS gc ON gc.id=sc.general_coffee_id
WHERE general_coffee_id=<some_category_id> AND stock>0
ORDER by name ASC;

 

My first question is, with inner joins, how important is the order? Does the order matter at all? If so, why? Sorry if that's too broad a question, but at the moment, I am having trouble imagining the whole situation.

 

Next, the first inner join is the following:

 

FROM specific_coffees AS sc INNER JOIN sizes AS s ON s.id=sc.size_id

 

So this is going to find all instances where id in the sizes table is equal to size_id in the specific_coffees table, and then pull the appropriate data for those rows, right? Is my thinking correct here?

 

Okay. So here's where it gets interesting. Is the second inner join acting off the results of the first inner join? In other words, the first inner join is performed, and then among those (filtered) results, we are running a second inner join that is further filtering the results.

 

Or perhaps, are the two inner joins independent, and the results from both are being concatenated together? This is where my understanding is weak. I tend to believe that from a logical standpoint, the former is more likely, but I really don't know.

 

Could someone please clarify this point?

 

In the end, I suppose I want to clarify whether the following two inner join statements would produce the same results or not:

 

FROM specific_coffees AS sc INNER JOIN sizes AS s ON s.id=sc.size_id
INNER JOIN general_coffees AS gc ON gc.id=sc.general_coffee_id

FROM specific_coffees AS sc INNER JOIN general_coffees AS gc ON gc.id=sc.general_coffee_id
INNER JOIN sizes AS s ON s.id=sc.size_id

 

And lastly, once the two joins have been performed, then the results are limited one more time by the WHERE clause, right? In essense, after the joins, we only want the results related to one type of coffee and for coffees in stock, right?

 

Sorry for the long post, but any help is appreciated.

 

As a random side note, the book uses the notation "ORDER by" at the end of the query, but I've always used "ORDER BY". I suppose it doesn't matter though.

Link to comment
Share on other sites

My first question is, with inner joins, how important is the order? Does the order matter at all? If so, why? Sorry if that's too broad a question, but at the moment, I am having trouble imagining the whole situation.

 

I'm not 100% sure of this. It matters in how the data is presented to you in SQL. It does not matter with PHP, as you specify which columns you would like to display, and how you would like to display them.

 

FROM specific_coffees AS sc INNER JOIN sizes AS s ON s.id=sc.size_id

 

So this is going to find all instances where id in the sizes table is equal to size_id in the specific_coffees table, and then pull the appropriate data for those rows, right? Is my thinking correct here?

 

Yes, you are absolutely right.

 

Okay. So here's where it gets interesting. Is the second inner join acting off the results of the first inner join? In other words, the first inner join is performed, and then among those (filtered) results, we are running a second inner join that is further filtering the results.

 

Inner joins does not filter data. It connects data. That is how you would want to look at it. Regarding your question: Think of the columns vertically. Another Inner Join will add columns to the right of the column that you are joining on. Coffee_id: 13 / Size: 0.5 (kilo) / Height: 17 (cm) / etc. Are you following?

 

Or perhaps, are the two inner joins independent, and the results from both are being concatenated together? This is where my understanding is weak. I tend to believe that from a logical standpoint, the former is more likely, but I really don't know..

 

And I did not read this before I answered your last quote. English is not my primary language, but this sounds more like the correct explanation of what happens. Arrest me if I'm wrong.

 

In the end, I suppose I want to clarify whether the following two inner join statements would produce the same results or not:

 

FROM specific_coffees AS sc INNER JOIN sizes AS s ON s.id=sc.size_id
INNER JOIN general_coffees AS gc ON gc.id=sc.general_coffee_id

FROM specific_coffees AS sc INNER JOIN general_coffees AS gc ON gc.id=sc.general_coffee_id
INNER JOIN sizes AS s ON s.id=sc.size_id

As stated earlier, I think this will produce the same result. I cannot absolutely guarantee it, but why not do a test in phpMyAdmin or something? :)

 

And lastly, once the two joins have been performed, then the results are limited one more time by the WHERE clause, right? In essense, after the joins, we only want the results related to one type of coffee and for coffees in stock, right?

 

Correctamento.

 

As a random side note, the book uses the notation "ORDER by" at the end of the query, but I've always used "ORDER BY". I suppose it doesn't matter though.

 

ORDER BY, order by, order BY, ORDER by... It does not matter. But you should really make it easier to write your queries. Larry has limited space, but you should make it easy to read.

 

Look at something like this:

 

SELECT coffee.id, coffee.name, coffee.price, type.bean_name, type.roast, package.height, package.width, package.weight

FROM spesific_coffees AS coffee

INNER JOIN sizes AS package ON (coffee.id = package.size_id)

INNER JOIN general_coffees AS type ON (coffee.id = type.general_coffee_id)

WHERE (package.size_id = 4) && (type.general_coffee_id = 2)

ORDER BY coffee.name, coffee.price

 

Isn't this just WAAAY more simple to read? I've changed you names a lot here, but have tried to keep it logical.

 

Hope you are get something from this. A second time: Write queries in something like phpMyAdmin to check your results!

  • Upvote 1
Link to comment
Share on other sites

I liked your answers Antonio, they were really clear, I too need to brush up on my complex joins and found them helpful. Also I'm a big admirer of you, HartleySan, Matt and any others who are communicating in their second/third languages :)

Link to comment
Share on other sites

I've seen that many struggles with more advanced SQL-queries. I've thought about writing a long post about queries, but where does one start?

 

Mysqli is so full of functionality, but very little of it is normally used by PHP users. We like to manipulate data with PHP, often making things a lot harder for ourselves.

 

Thank you for the nice words, Jonathon. I've written on UK and English board since I was 12-13 years old, and it's been great practice for me. :)

 

 

Link to comment
Share on other sites

Great answers Antonio!

 

Inner joins does not filter data. It connects data. That is how you would want to look at it. Regarding your question: Think of the columns vertically. Another Inner Join will add columns to the right of the column that you are joining on. Coffee_id: 13 / Size: 0.5 (kilo) / Height: 17 (cm) / etc. Are you following:

 

Absolutely! I talked with HartleySan about this on the phone last night, and I kind of agreed with him that it was filtering data, but only to the extent that it was grabbing columns from the joined tables which matched the WHERE clause. You put it much better!

 

Also, HartleySan was asking me questions about queries like "How do you choose which table to start with?" I told him that I look at the first table in the select statement as the root, or home, table. This table, in my view, becomes the "left" table, and all joins are carried out to the "right" of that table - this may seem like a load of nonsense, and have no bearing on the "LEFT" or "RIGHT" used in a JOIN clause, but that is how I picture it in my mind.

 

For clarity, I always write joins in the order that the joined tables are connected to (and move away from) the "root" table, but I don't see why you couldn't put them in any order. In simple select queries with a join on a prominant table (i.e. `users` or `movies`) it is usually easy to determine which table is the "left" table. It is usually the "one" in a "one-to-many" relationship. Of course, the real answer of which table comes first depends on the question you are asking. If you want to find all the posts created by a user in a specific forum thread, then you would abviously start with the `users` table and join the `threads` and `posts` tables from there, with the WHERE clause being set to equal the "user_id".

 

I'm sure I'm telling you and Jonathon stuff you already know, and I hope it doesn't sound like a bunch of hot air! It's just the way I look at it and it works for me!

 

Matt

  • Upvote 1
Link to comment
Share on other sites

Thank you all for sharing your points of view. It's good to bring multiple ways of thinking about something to the table (pun intended); it gives me a more holistic view.

 

Also, for the record, I'm American, and English is my first language. Japanese is my second, and I live in Japan, thus the handle.

 

Anyway, I agree that joins aren't really "filtering" data (at least, not in the sense that WHERE is). But still, with an inner join, some filtering is going on. For example, if I have a customers table and an orders table, and a customer has never made an order (which is a bit odd to think about), then that customer will not be included in an inner join with the orders table. I mean, again, it's not a lot of filtering, as the tables should more or less match up for all entries, assuming that the database is properly normalized and foreign keys are used properly.

 

I also found the following interesting excerpt from MySQL's reference pages:

 

INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both produce a Cartesian product between the specified tables (that is, each and every row in the first table is joined to each and every row in the second table).

(Source: http://dev.mysql.com/doc/refman/5.0/en/join.html)

 

This is right on track with the statement that Matt made a week or so ago in a related post. It's all starting to come together now.

 

Also, similar to what Matt said in his above post, the following are noted on the same page as was referenced above:

 

The NATURAL
JOIN of two tables is defined to be semantically equivalent to an INNER JOIN or a LEFT JOIN with a USING clause that names all columns that exist in both tables.

 

RIGHT JOIN works analogously to LEFT JOIN. To keep code portable across databases, it is recommended that you use LEFT JOIN instead of RIGHT JOIN.

 

So I guess in the end, like Matt said, it's good to think of your source table as the "left" table, and go from there. Although, with that said, if a database is properly designed, because a Cartesian product is the result of one or more inner joins, then the order *should* not matter, with the WHERE clause being the only limiting factor.

Link to comment
Share on other sites

Also, for the record, I'm American, and English is my first language. Japanese is my second, and I live in Japan, thus the handle.

 

Apologies HartleySan, though I'm sure you Japanese is also very good ;)

 

No Matt, not at all I liked your analogy, I've always found the area of which table should be first a little hazy. But I think i'm going to try your approach to it!

 

Thanks

Link to comment
Share on other sites

Thanks, guys, for this thread. It's a good one. The order in which you list the tables should not impact the results so long as you're using INNER joins and I generally agree with what Matt said about how he views this. One thing to consider, though, is performance: the faster you can narrow down the result, the better. I would recommend taking this experiment a step further and running different versions of the same join through EXPLAIN queries to see if MySQL thinks it would handle one ordering of tables better than another. You could benchmark the join in different orders, too, but benchmarks have a way of not being accurate.

Link to comment
Share on other sites

 Share

×
×
  • Create New...