Jump to content
Larry Ullman's Book Forums

Recommended Posts

Larry,

 

Happy New Year!  I know my posts are often not directly related to the topics in a particular book, but this one is definitely an "advanced" topic, and one which needs to be dealt with by a credible individual (there are scattered S.O. posts on the subject, but none of which provides a definitive solution for the problem).  This is a conundrum that has been bothering me for a while, but I want a best practice for updating a many-to-many table involving check box data.

 

Basically, Jon and I are working on an update form where users can add/delete train stations where they want to work (the stations appear in a list and users can check/un-check the stations they want to work at).  In the user_stations table we store the user_id, the train_line_id, and the station_id.  I have done a lot of research on this and there are a couple of different ways of dealing with updating a many-to-many check box table.

 

1)  The easiest way is simply to delete all rows in the user_stations table where the user_id is equal to the logged in user's id and then insert the new train station data for the given user.  This way seems to be the cleanest and most secure, but it would mean erasing everything and rewriting it again which could add unnecessary workload to the database.

 

2)  A clever solution I have seen is to take the array of data as it was originally and compare it to the data which is being updated.  All data which was in the original array, which is not in the new array, is then added to a "delete" array.  All data which is in the new array, which was not in the original array, is added to the "add" array.  Then, two queries are executed: one DELETES all data which is in the "delete" array, and another which INSERTS all data which is in the "add" array.  This way seems more efficient, but requires a lot more processing on the PHP side.

 

What is the best way to update a many-to-many table with check box data?  This is something that I think a lot of other people would also benefit a lot from, and I am shocked that so little information is out there!

 

As a side note - Larry has done an awesome job of explaining almost every area of PHP/mysql web development there is, and I can think of nobody better to come up with a workable solution (i.e. a detailed and comprehensive blog post) to this problem :)

 

Thanks in advance,

 

Matt

Link to comment
Share on other sites

I think most of the complexity here is related specifically to the checkbox design. If you changed the approach to something like toggled buttons with "Would work at" / "Not interested" and updating the list using AJAX, the design would be dead simple. While DB performance is important, this is not really expensive operations to perform. A series of small inserts/deletes might also work faster than a larger transaction. Due to a much simpler design, I would prefer this solution myself. (And have done in the past)

 

The array solution should also be viable. Unless we are talking tens of thousands of stations, you'll most likely not experience any huge problems. Try to develop a small test to see how long the computation takes. My bet is that it'll not really be a bottleneck.

Link to comment
Share on other sites

Thanks for the nice words. You've got the pros and cons of the two approaches down. I'm not sure there's anything I can add. I guess for larger sites with lots of data, I'm inclined to go the array route. For smaller sites with less data, a simple DELETE and INSERT. I don't have a hard rule here. 

 

A third option is to use a non-relational database, and you don't have to worry about this at all. But that's a whole different kettle of fish!

Link to comment
Share on other sites

Thank you very much for the responses!

 

@Antonio - I really like the idea of using AJAX to update the train data.  If a user is going to add/delete train stations where they can work, in most cases it will only be a relatively small amount as they would have already chosen most of them when they filled out the registration form.  The AJAX approach sounds clean and fast (if a check box is unchecked then delete it, and if it is checked then insert it).  My only concern with this is that it would still require a non-javascript solution for those who have js disabled.  And you are right that we are not dealing with a large amount of stations, although a user could theoretically add hundreds if they wanted to.  The site is in Japan, and as you may know, trains are the most common mode of transportation here.  In a metropolitan area like Tokyo, there are probably over a hundred stations from a person's home station within an hour's commute, but I think in most cases a user won't select that many.

 

btw - Hartley-san wrote all the backend logic for the original registration form!  He did a brilliant job on it!  I am taking the same form and turning it into an update form.  If you want to see the original form I can send you a PM with the address and directory username/password.

 

@Larry - I am starting to lean towards the array approach as well!  I think it is probably better because there could conceivably be lots of records in the database for a given user, and deleting all of them and reinserting them just to add one train station is absurd.  Either way, the whole index is going to have to be rebuilt every time the table is updated, and this will certainly slow down the query a bit.  Is this something I should be worried about?

 

Thanks again,

 

Matt

Link to comment
Share on other sites

Matt: I think you should read a little bit about interaction design. It will give you some tools and a process for finding the best possible UI design. I've very successfully used human-computer interaction to design interfaces for a car retailer system before. If you have the money, you could of course hire someone to do the job for you. Bad cognitive models in a UI is often a reason system don't get used, no matter how good they are.

 

As for my suggestion, I was actually thinking of a adding a single, toggle-able button on each station and sorting the rows into groups of "selected" and "not selected" stations. A click on the button would insert/remove that specific station with AJAX immediately. Think of the "Like"-button on Facebok as an example.

The fallback for non-JS here is sing GET with the user_id, train_line_id and station_id on the buttons. (which are really links)

 

If this is a problem with things taking too long, I would look at implementing a job queue with workers running in the background.

Link to comment
Share on other sites

I don't think a job queue on the back-end would be necessary, since the common practice these days with small Ajax interactions like the ones you mention is to show immediate success on the UI side, even though the Ajax interaction hasn't actually had time to go through yet, and then only change the success indication in the rare case that a failure occurs.

 

This is often more than adequate, as it allows the user to perceive what they see as an instantaneous change, but also allows you to log each click separately, and always have the DB reflect the most current state.

For the non-JS version, I'd more or less just stick with what you already have, even though it won't be quite as good an experience as the JS version. That's pretty typical though. The fact that you're providing a non-JS site is more than most sites do these days anyway.

 

As a final thought, I agree with Larry that just using the array_diff function or whatever to calculate the rows to insert or delete would be best.

Good luck!

  • Upvote 1
Link to comment
Share on other sites

Thank you for the advice guys!  I really appreciate it!

 

The next problem I have to deal with is all the nested table updates.  Only one uses an actual UPDATE query, but the rest use DELETE/INSERT to update the data (because they are many-to-many relations).  Because the user probably won't update data that will affect every table, I can't use mysqli_stmt_affected_rows() to check whether a specific update was successful or not (and move on to the next query).

 

Any thoughts on this?

 

Thanks,

 

Matt

Link to comment
Share on other sites

I think you're just going to have to check which data was changed, and based off of that, update the corresponding tables accordingly.

 

For example, if just the user's name is changed, that's just an update on the users table, but if both the user's name and the train data is updated, then you'll have to update the users table and update the train data table with DELETE/INSERT queries.

 

Really, I think it's going to come down to something like the following:

if (any-user-data-changed) {
  // Update users table.
}

if (train-data-changed) {
  // Update train data table.
}

Beyond that, I can't think of any easy way to know what tables to update.

I'm worried that maybe I misunderstood your question. Did I properly understand your question?

Link to comment
Share on other sites

I think that too sounds like a design issue, either in the user interface form designs or in the database. Could you possibly explain your problems a little more in detail?

 

The best thing you can do is to make sure your related inserts/updated/deletes are done as a single operation. You can do this by concatenating Strings into a single query or imploding an array and combining inserts and select queries. (Look at WHERE key IN (a, b, c)

 

To make sure everything works as expected, you should go with transactions. With the right level of modularity and high-level abstractions, the code should be very easy to follow. The key here is to go down the model route in MVC, even if you don't actually apply the pattern elsewhere. Calling bool updateUserStations( $userId, array $keys ) that wraps transaction handling, calls deleteRemovedStationsFromUser( $id, $keys['delete']), insertNewUserStations( $userId, $keys['insert'] ) and makeShiftCleanup() will make sure you can develop things in small steps and make sure it all works. Even if I don't go down the MVC route, I always write query functions instead of mixing DB and logic in the scrips themselves.

Link to comment
Share on other sites

Antonio, Matt didn't fully explain everything (as that would take forever), but we are using transactions (with prepared statements), and have wrapped all of our DB operations in simply-to-use functions.

In other words, we're using our own flavor of MVC, and have properly modularized everything.

 

Also, all deletes are performed in a single query, and all inserts are performed in a separate single query.

Unless I'm missing something obvious though, I don't see anyway to combine those two queries.

 

I guess my point is, we are being as efficient as possible.

Thanks a lot for the advice though.

Link to comment
Share on other sites

That's why I pointed out it might be an interface issue rather than a programming issue at that. I also don't know Matt's programming skills and how you guys split the work-load. I'll just underline I'm not trying to insult anyone. Dismiss my advice if it's not appropriate/needed.

 

Nice to hear it's going fine programming-wise though!

 

To explain further, have you thought about separate interfaces and/or forms for managing stations? Instead of both adding and deleting stations in one fell-swoop, you might benefit from dividing the two operations. That might also be a easier mental model to understand for your users, and would be simpler to program.

 

I don't try to insult you, or step on any toes, by giving very basic advice. However, without any specific details to work with, I get the feeling you might have overlooked some UI design and data flow that may simplify your application code. This is why I pointed to Interaction Design as a process that will help you in that regard. Non of us are really designers when push comes to show. (Hope I use that correctly)

Link to comment
Share on other sites

Antonio, I completely understand how you feel.

90% of the time, the people asking questions on these forums don't give nearly enough information, and you also have to consider their general programming ability as well to help answer the question properly.

As such, I totally understand why you said what you said in your previous posts.

 

With that said, I can personally vouch for Matt.

He is an excellent PHP programmer and designer. He has a great understanding of data structures, security and programming logic, and so when he asks a question on these forums, it's usually very specific, and it's looking for an equally specific answer.

That's not to imply that his original question was completely clear, and that it's impossible to not be potentially confused. And like I said, you'd have no way of knowing the specifics of Matt's style of programming. I just happen to know Matt personally, so I know his personality. And furthermore, we've been working together on this project, so I have a lot more context than anyone else.

 

To answer another of your questions, Matt is doing a majority of the workload. I'm kinda helping his as an informal consultant, but he's doing a majority of the work.

 

Also, we both appreciate your advice (and Matt and I discussed that very thing the other day between us), but because you don't know the details of the project, I did feel the need to step in a couple of times and set the record straight, per se, so that you could better understand what was going on.

I will readily admit that I was a bit curt with you in my responses, and to that, I humbly apologize. I totally want you to chime in whenever you feel appropriate. Thank you.

 

Lastly, I see what you're saying about separate interfaces for adding and deleting train stations. That certainly is a consideration.

Personally, the approach I often take to organizing code/creating interfaces is to write a single function or whatever that just does everything I want. I'll keep all the relevant code together as one block until I come across a situation where I need a piece of the code for something else.

At that point, I will then separate that piece of code, and somehow modularize it. In essence, if I only need some code once, I pretty much lump it together, but if I need a block of code more than once, then I pull it out of the flow and modularize it.

To that end, because the only use case for these interfaces at the moment is handling train station updates for one particular form, we have decided to keep all the code together for now.

 

Anyway, thanks again for your two cents.

Link to comment
Share on other sites

Glad you take it that way Jon and Matt. As a non-native to English, the sublities in communication is harder for me textually. No need for an apology to me. :)

 

It looks like we still have a communication breakdown though. I'm not talking about coding in reference to interfaces, but to the graphical UI that drives the data flow around. Does that make more sense?

Link to comment
Share on other sites

Oh, I understand now. So you're saying have a separate section on the screen for adding stations versus deleting?

 

I suppose that's possible, but personally, I don't like it.

I think it's less intuitive (i.e., worse UX), and also, it clashes with how we do it everywhere else on the site. Everywhere else on the site, we have one set of train data that can be interacted with.

Besides, having it all together simply means that we may have to do both INSERT and DELETE queries in one request. Hardly a big deal. And if we hook up a JS solution as well, then each click will be a separate request, thus eliminating that concern.

Link to comment
Share on other sites

I wasn't here for a couple of days and all hell breaks loose!

 

Jon, yes you understood my question correctly!  I think I have a solution for the updates.  I think you're right about having them in series, but the problem is that if any one query fails, then only certain sections will get updated.  This may confuse the user, so I decided it is better to keep track of errors at each step.  This way we can rollback if anything fails.  Any thoughts?

 

Antonio, please don't worry at all about any misunderstandings.  I know you are a great coder and value your input highly!

 

As far as the form, Jon and I discussed the implementation of the train stations in depth before he coded it and it's probably about as intuitive as you can make it!  I had spent several days studying web form design best practices, and everything from the line spacing to the size of the buttons has been taken into account as well.

 

Matt

Link to comment
Share on other sites

Instead of trying to keep track of where a failure occurred (in the event that it does), I would just use a transaction so that the entire series of queries fails at once if any part fails.

 

While doing so may require the user to do a bit more work in the rare event of a failure, I think it's actually more intuitive for the user to redo everything, as opposed to having them try to figure out which part failed, in an effort to preserve some of their work.

And of course, rolling back the entire transaction will make things a million times easier for DB integrity.

 

I guess at the end of the day, I'd do this:

- For non-JS users, do the whole form in one submission, and if anything fails, too bad, they have to redo everything.

- For JS users, when you have the time, make it so that an Ajax request is made per click on train station data.

Link to comment
Share on other sites

Jon,

 

I agree that using a transaction is the best way to do it!  Doing all the queries one after the other, and rolling back if any of them fails, makes sense.

 

Also, for non-JS users, doing the whole form in one submission is great.  Adding Ajax requests only to the train data is a whole different story, as we would also have to deal with the rest of the form submission separately.

 

What do you think?

Link to comment
Share on other sites

 Share

×
×
  • Create New...