Jump to content
Larry Ullman's Book Forums

Stuart

Members
  • Posts

    141
  • Joined

  • Last visited

  • Days Won

    12

Posts posted by Stuart

  1. I don't know anything about creating encrypted connections to MySQL with SSH, however that article's referring to the connection details being sent in clear text - PHPMyAdmin is not inherently insecure. To eliminate this security hole you'd need to invest in secure hosting so your details are sent over SSL or use something like PUTTY but you won't have a nice GUI to work with then.

    • Upvote 1
  2. If you know the error that's being thrown from MySQL rather than just from PHP then you might be able to get to the bottom of it a little easier - just use:

     

    echo mysqli_error($dbc);

     

    If you just want to return the product of the two values in separate tables you could just do it in one query like:

     

    $q = "SELECT (table1.col2 * table2.col2) AS valueyouwant FROM table1, table2 WHERE table1.col1 = '3' AND table2.col1 = '4'";

     

    Just for reference if you want to assign it to an intermediate variable then I'd keep it inside MySQL. You can setup a MySQL variable that can be used across subsequent queries within the same connection and then use SELECT... INTO something like this:

     

    $q = "SELECT col1 INTO var FROM table1 LIMIT 1";

     

    Also just to add a wider point - SELECT... INTO is not supported by MySQL for tables only for variables - if you wanted to achieve this the MySQL method would be either INSERT INTO... SELECT or CREATE TABLE... SELECT.

    • Upvote 1
  3. So the literal match "" returns exact word combinations, a softer approach might be to use the + operator. So by prefixing your search term with a '+' you ensure that the word will be present in any row returned. By prefixing all search terms with a '+' is ensures they are present but not necessarily side by side. To get round the relevance issue you need to do something like:

     

    $q = "SELECT *, MATCH(description) AGAINST ('$find') AS relevance FROM products, doctors, types, subtypes WHERE doctors.doc_id = products.doc_id AND types.type_id = products.type_id AND subtypes.sub_id = products.sub_id AND MATCH (description) AGAINST ('$find' IN BOOLEAN MODE) ORDER BY relevance DESC";
    

     

    You might also want to add a HAVING clause to ensure only items with a relevance above a certain threshold (e.g. 50%) are returned. Hope that all makes sense. An easy way to prepend all your search terms with the + might be something like:

     

    $safe = mysqli_real_escape_string($dbc, trim($_GET['terms']));
    $find = '+' . str_replace(' ', ' +', $safe);
    

     

    Or if you wanted to be precise regarding spacing etc... you could use a REGEX.

    • Upvote 1
  4. I always find these articles interesting - I started reading lots of them after I read Larry's article on what makes a good programmer. One which caught my eye is:

     

    require_once() is 24% faster than include()

     

    I've seen benchmarking to support this but the tests looked shoddy and I've also read an article by Rasmus Lerdorf (the creator of PHP) talking about avoiding the use of _once functions:

     

    Try to avoid using include_once and require_once if possible. Sometimes there is no way around using these calls, but recognize that each one costs you an extra open() syscall and hash look up.

     

    If he says it it's good enough for me...

    • Upvote 1
  5. I've been working with fulltext searches a lot recently and they can be frustrating at times! If you wanted an exact phrase match for Bonn Forceps then you'd need to run your query in BOOLEAN mode and use literal text matching. Its just like searching in Google and wrapping your search phrase in "".

     

    So you query might become something like:

     

    $q = "SELECT * FROM products, doctors, types, subtypes WHERE doctors.doc_id = products.doc_id AND types.type_id = products.type_id AND subtypes.sub_id = products.sub_id AND MATCH (description) AGAINST ('\"$find\"' IN BOOLEAN MODE) ORDER BY prod ASC";
    

     

    Searching in boolean mode does have a few drawbacks mainly that they are not automatically ordered by relevance (although there are workarounds for this). Encouraging users to enter the boolean operators is a waste of time (we tried it) so just add them yourself before generating the query.

    • Upvote 2
  6. Thats the only problem I can see with your code - when you fixed it did you just press refresh in the browser? If so then you're just resubmitting the old POST data which will contain $_POST['email'] rather then the new $_POST['name']. Access the page fresh i.e. type the URL in again and hit enter. Then resubmit the details and it should work... unless I'm missing something.

    • Upvote 1
  7. Well I've managed to come up with a solution that kind of works - using the result of GROUP_CONCAT CAST to CHAR in the HAVING clause doesn't work because they become a string not a set of integers. The solution I've got is to run a WHERE... IN clause to get rows that match either category and then SELECT COUNT(DISTINCT habitat_type_id) AS habitats and add a HAVING clause like HAVING habitats = 2 (with 2 being the number of arguments passed in the WHERE... IN clause.

     

    So the query looks something like this:

     

    SELECT reference_id, COUNT(DISTINCT habitat_type_id) AS habitats, ( (0.2 * (MATCH(reference_title) AGAINST ('england' IN BOOLEAN MODE))) + (0.25 * (MATCH(reference_text) AGAINST ('england' IN BOOLEAN MODE))) + (0.05 * (MATCH(reference_tags) AGAINST ('england' IN BOOLEAN MODE))) + (0.3 * (MATCH(reference_authors) AGAINST ('england' IN BOOLEAN MODE))) + (0.2 * (MATCH(reference_keywords) AGAINST ('england' IN BOOLEAN MODE))) ) AS relevance FROM nn_references LEFT JOIN habitats_references USING (reference_id) WHERE (MATCH (reference_title, reference_text, reference_tags, reference_authors, reference_keywords) AGAINST ('england' IN BOOLEAN MODE)) AND habitat_type_id IN (4,11) GROUP BY reference_id HAVING relevance > 0 AND habitats = 2 ORDER BY relevance DESC

     

    I also since found a solution with JOINS in a post on intersection that seems to work. The only problem is that in theory a user can make a OR selection too:

     

    e.g. Initially choose a habitat type of foo OR bar - then once they start to refine the results using the checkboxes these become (foo AND baz) OR (bar AND baz)

     

    Add another refinement and in theory it would become:

     

    (foo AND baz AND qux) OR (bar AND baz AND qux)

     

    Am I right in thinking that now I'd have to essentially query twice and merge the result sets? Would I do this in MySQL or server side in PHP e.g. Write into an array using the document ID as the array key?

     

    Thank you for any help/guidance you can give

  8. Breaking in and out of PHP like a WordPress template would do the trick I think - so:

     

    <?php
    
       // Run query here
       // Get results into $row
    
    ?>
    
    <body>
    
       <div>
    
         <h1><?php echo $row['title']; ?></h1>
    
         <p><?php echo $row['body']; ?></p>
    
       </div>
    
     </body>
    

     

    The alternative is to echo out new line characters \n (inside "" not '') and tabs \t but I don't think you'd ever be able to get the indentation correct inline with your logic loops. Thats the only solution I know...

     

    Hope that helps

     

    Stuart Bates

    • Upvote 1
  9. So nice to be able to get rid of my old forum name :) but I'm having some MySQL problems and I really can't figure out - I keep thinking I must be missing something simple.

     

    I have a set of (37) fully normalised tables to store the data with integrity - these normalised tables drive a non-normalised search table that I conduct full text queries on. Fulltext searches from this database are extracted using INSERT INTO... SELECT to place them into a temporary table. The temporary table is then used to extract the rows required for pagination based on an INNER JOIN to the relevant tables and to calculate the number of total results for pagination. (COUNT(*) was not used for this because I have to extract all the IDs into the temporary table anyway so MYSQLI_AFFECTED_ROWS seemed like a more efficient solution).

     

    The reason I have to store them in a temporary table is because we then have to run a series of COUNT queries to determine which categories the results fall into and present these back with a series of checkboxes. The user can then select multiple sub-categories from the sidebar to narrow down their search. At this point I think an example is needed:

     

    1. The user does a generic search for England
    2. Search returns 330 results with a series of refinement checkboxes for habitat types: (Grassland - 48, Shrubland - 45, Wetlands - 33)
    3. User can make one or multiple selections from the checkboxes

     

    If they make just one selection a WHERE clause similar to: habitat_type_id = 4 is inserted into the query - which works fine. But if the user makes two selections (either at the same time or after the first filter has been added) then it doesn't work. The query becomes eg. habitat_type_id = 4 AND habitat_type_id = 11. This won't return anything because the row being examined will only ever have one value in. Next I tried a WHERE... IN clause but that returns documents that have a habitat_type_id of 4 or 11 not both.

     

    I thought maybe aliasing the columns was the way to go but I've had no luck - I've also tried a rather hacky implementation of GROUP_CONCAT and HAVING to try and extract them. Seems there must be a simpler technique/query!?

     

    I think that's enough information but if you need to know anything else just ask - I've included the query which breaks below:

     

    RETURNS: Nothing

     

    INSERT IGNORE INTO tmp_references SELECT reference_id, ( (1.75 * (MATCH(reference_title) AGAINST ('england' IN BOOLEAN MODE))) + (1 * (MATCH(reference_text) AGAINST ('england' IN BOOLEAN MODE))) + (0.75 * (MATCH(reference_tags) AGAINST ('england' IN BOOLEAN MODE))) + (1 * (MATCH(reference_authors) AGAINST ('england' IN BOOLEAN MODE))) + (2 * (MATCH(reference_keywords) AGAINST ('england' IN BOOLEAN MODE))) ) AS relevance FROM nn_references LEFT JOIN habitats_references USING (reference_id) WHERE (MATCH (reference_title, reference_text, reference_tags, reference_authors, reference_keywords) AGAINST ('england' IN BOOLEAN MODE)) AND habitat_type_id = 4 AND habitat_type_id = 11 GROUP BY reference_id HAVING relevance > 0 ORDER BY relevance DESC

     

    RETURNS: Documents matching either not both

     

    INSERT IGNORE INTO tmp_references SELECT reference_id, ( (1.75 * (MATCH(reference_title) AGAINST ('england' IN BOOLEAN MODE))) + (1 * (MATCH(reference_text) AGAINST ('england' IN BOOLEAN MODE))) + (0.75 * (MATCH(reference_tags) AGAINST ('england' IN BOOLEAN MODE))) + (1 * (MATCH(reference_authors) AGAINST ('england' IN BOOLEAN MODE))) + (2 * (MATCH(reference_keywords) AGAINST ('england' IN BOOLEAN MODE))) ) AS relevance FROM nn_references LEFT JOIN habitats_references USING (reference_id) WHERE (MATCH (reference_title, reference_text, reference_tags, reference_authors, reference_keywords) AGAINST ('england' IN BOOLEAN MODE)) AND habitat_type_id IN (4,11) GROUP BY reference_id HAVING relevance > 0 ORDER BY relevance DESC

     

    Thank you in advance guys

     

    PS Execution speed is of the essence with this project :s not sure if that makes any difference to the solution.

×
×
  • Create New...