Larry Ullman's Book Forums

Dimitri Vorontzov

Community Reputation

0 Neutral

About Dimitri Vorontzov

  • Rank
    Advanced Member
  1. Thanks, HartleySan, and you're absolutely right: now I'm attempting to define a set of requirements that would validate a reasonably large, most probable percentage of whatever comes before @ in an email address, and I'm right in the middle of researching that. Obviously, I'm not just sitting around waiting for someone to solve my problems: for the last few days I've been doing mostly research on regex. Thanks for the resource on lookarounds, it's indeed very valuable!
  2. Antonio, point taken, thank you. Ditto HartleySan, but to answer your question about what else I want to know, it's this: I'm well aware I'm caught in the newbie trap of trying to invent the perfect regex to match an email address. But my goal isn't to match all possible versions of an email address; I just want to figure out a regex that would work with the majority of normal email addresses. The purpose of that "quixotic pursuit" is not to validate emails in any practical application, but rather to master the PHP flavor of regex, using email validation as an example, for the la
  3. Thanks, Jonathon! I wasn't aware of regexlib.com – it's an excellent resource, and I appreciate your posting it. It's Larry who deserves full credit for "quixotic".
  4. Sure, Larry, I understand – and after all, it's your forum, isn't it? You can even ban me from it at the click of a button whenever you wish. But try to look at it from a different perspective. What if someone is searching Google for "regex to validate email" and finds this already popular thread in your forum? They will be attracted by the discussion, and then they may become curious: "who is this guy, Larry Ullman? Oh, interesting, he wrote quite a few books on PHP! And on other languages! Why don't I check them out!" And then that person will buy your books, read them, learn from them,
  5. Great. Now let's see if we can further improve it. The pattern I've arrived at (thanks to everyone's help!) is this: ^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$ It will validate email addresses like these: somename@somewebsite.com, somename@some-website.com, or somename@somewebsite.co.uk -- however, it will also validate these ugly things: somename@some......website.com, somename@--somewebsite..com -- and similar things. To fix that problem, we can try to replace the pattern after @ that includes some number of letters, numbers, periods and dashes: [A-Za-z0-9.-]+ -- with the pattern t
  6. Oh, Larry, when one day you finally meet me in person, as I hope, you will realize that wrath and yours truly are completely incompatible. I got the point about filter_var() being the preferred method, thank you very much for reinforcing it, Larry. Still, what I meant by regular expression that's designed to match an email address "doing the job" is this: validate only email addresses that may include letters, dots, dashes, numbers or underscores before @, letters, numbers, dashes and dots but NO underscores after @, and a certain reasonable number of letters after the last dot. No mo
  7. Thank you, Larry! This is an important clarification. I wasn't just idly waiting for your reply, and did a bit of research. Turns out, there are at least two more answers to my initial question that appear to be correct. Instead of this: ^[\w.-]+@[A-Za-z0-9\.\-]+\.[A-Za-z]{2,6}$ -- or this (without escape characters): ^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$ -- the same "perfect" email validation pattern apparently can be expressed a bit shorter, like this: ^[\w.-]+@[^\W_]+\.[A-Za-z]{2,6}$ -- or even like this: ^[\w.-]+@(
  8. Aha! That's new and interesting, thank you Larry. Also, I added a second question to the previous post, could you please take a look and comment? I'd be grateful for your input. I'm asking my questions because I really want to know, not just to be an ^a(ss|rse)hole$.
  9. Not berating at all: I respect everyone on this forum, and personally you, Larry. I just thought my question was clear from the beginning, namely: So I was a little taken aback by what I perceived as all the guys ignoring or misinterpreting the actual question, and persistently offering a workaround instead. I should have blamed myself instead and rephrased the question right away, but I truly thought what I was asking about was clear from the start. I also honestly thought that you were joking, when you posted the huge chunk of code above. I thought you implied the absurdity
  10. Hm... I think I can see how this may work, but something seems missing here. Aren't you supposed to have * instead of + after [000-\031]? Seriously though, Larry, I appreciate your humorous (even if slightly extreme) input very much, but... let's think calmly. The chapter on regular expressions in your book is teaching how to use regular expressions, not filter_var() function. My question is about regular expressions, not filter_var() function. I think it's fair of me to treat this forum with respect as a trustworthy source of information, ask questions about things that interest me,
  11. Thanks, HartleySan! Again, I'm fully with you and Antonio and Edward, as far as using filter_var, and not having to be perfect in most practical situations. But purely theoretically – this is a forum dedicated to honing one's programming skills, and we can allow ourselves to play a little, can't we? So, my question to you, HartleySan. The solution that you proposed: [A-Za-z0-9] - am I correct in thinking that it would reject the following perfectly valid email address as invalid? somename@some-website.com And if yes, as it seems to be, then what would be a better sol
  12. You have a point here, Antonio. Still: perfection, in regex, I guess, is when it does exactly what it is meant to. If a regex is written to check for a valid email address, it's supposed to check for a valid email address, no more, no less. This one doesn't do the job. I want the one that does. Yes, a user can submit something that is not an actual email address. But I don't want that user to submit something that is not a valid email address. There's a difference. When things don't do what they are supposed to be doing, the next thing that usually happens, the entire civilization crumbles. We
  13. From a purely practical point of view, Antonio, you're quite right, but I'm in pursuit of perfection with this one. How can we say "any character in \w, except _"?
  14. Thank you, Edward (cool avatar!). This is, of course, a perfect solution – but I'm curious about how to improve this particular regex.
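
For anyone who wants to try the patterns quoted in the posts above, here is a minimal sketch. It makes several assumptions that are not from the thread: Python's `re` module stands in for PHP's preg_match(), the `re.ASCII` flag is used so that \w stays limited to letters, digits and underscore, and the names `LABEL`, `STRICTER_PATTERN` and `is_plausible_email` are hypothetical. The stricter pattern is one possible way to reject the "ugly" addresses from post 5, not the solution the thread settled on, and it is a learning exercise rather than a replacement for filter_var().

```python
import re

# Assumption: re.ASCII keeps \w limited to [A-Za-z0-9_], which is roughly how
# PHP's preg_* functions treat \w when the /u modifier is not used.
FLAGS = re.ASCII

# The pattern quoted in post 5.
THREAD_PATTERN = re.compile(r"^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$", FLAGS)

# "Any character in \w, except _" (post 13): a negated class that excludes
# every non-word character and also the underscore.
WORD_NO_UNDERSCORE = re.compile(r"^[^\W_]+$", FLAGS)

# Hypothetical stricter variant: the part after @ is a series of
# dot-separated labels. Each label starts and ends with a letter or digit,
# and hyphens are allowed only in between.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
STRICTER_PATTERN = re.compile(
    r"^[\w.-]+@(?:" + LABEL + r"\.)+[A-Za-z]{2,6}$", FLAGS
)


def is_plausible_email(address: str) -> bool:
    """Return True if the address matches the hypothetical stricter pattern."""
    return STRICTER_PATTERN.match(address) is not None


if __name__ == "__main__":
    samples = [
        "somename@somewebsite.com",
        "somename@some-website.com",
        "somename@somewebsite.co.uk",
        "somename@some......website.com",
        "somename@--somewebsite..com",
    ]
    for sample in samples:
        print(
            sample,
            bool(THREAD_PATTERN.match(sample)),
            is_plausible_email(sample),
        )
```

Note that [^\W_] on its own matches neither hyphens nor dots, so using it alone for the part after @ would reject somename@some-website.com, which is the concern raised in post 11. The label-based variant keeps hyphens but only inside a label; it still accepts consecutive hyphens within a label and leaves the part before @ as permissive as in the thread's pattern.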