Jump to content
Larry Ullman's Book Forums

Regular Expressions


Recommended Posts

Hello,

 

Could someone help me with a regular expression in MySQL?

 

If I test the following pattern in PHP:

\b(\')?((eau){1}|(eaux){1})\b

 

the results are fine since I get a match for

eau

l'eau OR l'eau coule

d'eau OR cours d'eau

les eaux OR les eaux ruissellent

 

but no match for

carreaux (the word ends with 'eaux', but "eau" or "eaux" is not the beginning of the word, and doesn't occur just after an apostrophe)

beaucoup ('eau' or 'eaux' is neither the beginning nor the end of the word)

 

Yet if I try the very same pattern in the database, using phpMyAdmin, I get no results at all although all the above examples are in the database.

 

Could someone explain why this is happening, and what I need to do in MySQL?

 

In case you wonder why I'm not using a FULLTEXT search, I plan to use regular expressions for words that are 1, 2 or 3 characters long, and a FULLTEXT search for longer words or expressions.

 

With thanks for your help,

Link to comment
Share on other sites

Thank you for answering, Larry. No, apparently it isn't a collation or character set issue (for once!). It's apparently a problem with word-boundaries since this query returns no result:

SELECT id_syntagme, syntagme_c FROM syntagmes WHERE syntagme_c REGEXP '\b(\')?((eau){1}|(eaux){1})\b'

 

whereas this query works… except that it returns all the entries containing "eau" or "eaux", including "carreaux" or "beaucoup" (which I need to exclude):

SELECT id_syntagme, syntagme_c FROM syntagmes WHERE syntagme_c REGEXP '(\')?((eau){1}|(eaux){1})'

 

Is there another way of specifying word-boundaries in a MySQL query?

 

With thanks for your help,

Link to comment
Share on other sites

I've just been looking at the MySQL 5.1 Manual, and I found the explanation here:

http://dev.mysql.com.../en/regexp.html

 

 

In the MySQL version of regular expressions, \b means a backspace. And word-boundaries are instead shown like this:

REGEXP '[[:<:]]word[[:>:]]'

 

So, rewriting my previous regular expression like this:

REGEXP '[[:<:]](\')?((eau){1}|(eaux){1})[[:>:]]'

returned just the results I wanted.

  • Upvote 1
Link to comment
Share on other sites

 Share

×
×
  • Create New...