Jump to content
Larry Ullman's Book Forums

Recommended Posts

Hello,

 

Could someone help me with a regular expression in MySQL?

 

If I test the following pattern in PHP:

\b(\')?((eau){1}|(eaux){1})\b

 

the results are fine since I get a match for

eau

l'eau OR l'eau coule

d'eau OR cours d'eau

les eaux OR les eaux ruissellent

 

but no match for

carreaux (the word ends with 'eaux', but "eau" or "eaux" is not the beginning of the word, and doesn't occur just after an apostrophe)

beaucoup ('eau' or 'eaux' is neither the beginning nor the end of the word)

 

Yet if I try the very same pattern in the database, using phpMyAdmin, I get no results at all although all the above examples are in the database.

 

Could someone explain why this is happening, and what I need to do in MySQL?

 

In case you wonder why I'm not using a FULLTEXT search, I plan to use regular expressions for words that are 1, 2 or 3 characters long, and a FULLTEXT search for longer words or expressions.

 

With thanks for your help,

Share this post


Link to post
Share on other sites

Thank you for answering, Larry. No, apparently it isn't a collation or character set issue (for once!). It's apparently a problem with word-boundaries since this query returns no result:

SELECT id_syntagme, syntagme_c FROM syntagmes WHERE syntagme_c REGEXP '\b(\')?((eau){1}|(eaux){1})\b'

 

whereas this query works… except that it returns all the entries containing "eau" or "eaux", including "carreaux" or "beaucoup" (which I need to exclude):

SELECT id_syntagme, syntagme_c FROM syntagmes WHERE syntagme_c REGEXP '(\')?((eau){1}|(eaux){1})'

 

Is there another way of specifying word-boundaries in a MySQL query?

 

With thanks for your help,

Share this post


Link to post
Share on other sites

I've just been looking at the MySQL 5.1 Manual, and I found the explanation here:

http://dev.mysql.com.../en/regexp.html

 

 

In the MySQL version of regular expressions, \b means a backspace. And word-boundaries are instead shown like this:

REGEXP '[[:<:]]word[[:>:]]'

 

So, rewriting my previous regular expression like this:

REGEXP '[[:<:]](\')?((eau){1}|(eaux){1})[[:>:]]'

returned just the results I wanted.

  • Upvote 1

Share this post


Link to post
Share on other sites

Ah, excellent. Kudos for figuring it out and thanks for posting the solution. Apologies for not getting an answer to you sooner myself.

Share this post


Link to post
Share on other sites

Hello, Larry,

 

Please don't apologize. There's no reason why you should do the research if you don't know the answer off the top of your head. :)

 

Best wishes,

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...

×
×
  • Create New...