Larry Ullman's Book Forums

Dimitri Vorontzov

Everything posted by Dimitri Vorontzov

  1. Thanks, HartleySan, and you're absolutely right: now I'm attempting to define a set of requirements that would validate a reasonably large share of the most probable forms of whatever comes before @ in an email address, and I'm right in the middle of research on that. Obviously, I'm not just sitting around waiting for someone to solve my problems: for the last few days I've mostly been researching regex. Thanks for the resource on lookarounds, it's indeed very valuable!
  2. Antonio, point taken, thank you. Ditto HartleySan, but to answer your question about what else I want to know, it's this: I'm well aware I'm caught in the newbie trap of trying to invent the perfect regex to match an email address. But I don't have the goal to match all possible versions of an email address, I just want to figure out the regex that would work with a majority of normal email addresses. The purpose of that "quixotic pursuit" is not to validate emails in any practical application, but rather to master the PHP flavor of regex, using email validation as example, for the la
  3. Thanks, Jonathon! I wasn't aware of regexlib.com – it's an excellent resource, and I appreciate your posting it. It's Larry who deserves full credit for "quixotic".
  4. Sure, Larry, I understand – and after all, it's your forum, isn't it? You can even ban me from it at a click of a button whenever you wish. But try to look at it from a different perspective. What if someone is searching Google for "regex to validate email" and finds this already popular thread in your forum? They will be attracted by the discussion, and then they may become curious: "who is this guy, Larry Ullman? Oh, interesting, he wrote quite a few books on PHP! And on other languages! Why don't I check them out!" And then that person will buy your books, read them, learn from them,
  5. Great. Now let's see if we can further improve it. The pattern I've arrived at (thanks to everyone's help!) is this: ^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$ It will validate email addresses like these: somename@somewebsite.com, somename@some-website.com, or somename@somewebsite.co.uk -- however, it will also validate these ugly things: somename@some......website.com, somename@--somewebsite..com -- and similar things. To fix that problem, we can try to replace the pattern after @ that includes some number of letters, numbers, periods and dashes: [A-Za-z0-9.-]+ -- with the pattern t
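The behaviour described in this post can be checked by running the quoted pattern through preg_match(). This is only a minimal sketch: the helper name matchesPattern() and the sample list are assumptions made for illustration, not code from the book or the thread.

```php
<?php
// Pattern quoted in the post, wrapped in / delimiters for preg_match().
$pattern = '/^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$/';

// Hypothetical helper: true when the whole string matches the pattern.
function matchesPattern(string $pattern, string $email): bool
{
    return preg_match($pattern, $email) === 1;
}

// Sample addresses taken from the post: three normal ones, two "ugly" ones.
$samples = [
    'somename@somewebsite.com',
    'somename@some-website.com',
    'somename@somewebsite.co.uk',
    'somename@some......website.com',
    'somename@--somewebsite..com',
];

foreach ($samples as $email) {
    $verdict = matchesPattern($pattern, $email) ? 'match' : 'no match';
    echo $email, ' => ', $verdict, PHP_EOL;
}
```

All five samples match, because the class [A-Za-z0-9.-]+ places no limit on how many dots or dashes appear in a row or where they sit.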
  6. Oh, Larry, when one day you finally meet me in person, as I hope, you will realize that wrath and yours truly are completely incompatible. I got the point about filter_var() being the preferred method, thank you very much for reinforcing it, Larry. Still, what I meant by regular expression that's designed to match an email address "doing the job" is this: validate only email addresses that may include letters, dots, dashes, numbers or underscores before @, letters, numbers, dashes and dots but NO underscores after @, and a certain reasonable number of letters after the last dot. No mo
  7. Thank you, Larry! This is an important clarification. I wasn't just idly waiting for your reply, and did a bit of research. Turns out, there are at least two more answers to my initial question that appear to be correct. Instead of this: ^[\w.-]+@[A-Za-z0-9\.\-]+\.[A-Za-z]{2,6}$ -- or this (without escape characters): ^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$ -- the same "perfect" email validation pattern apparently can be expressed a bit shorter, like this: ^[\w.-]+@[^\W_]+\.[A-Za-z]{2,6}$ -- or even like this: ^[\w.-]+@(
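One caveat about the shorter form, offered as an observation rather than a verdict on the thread: [^\W_] excludes the dot and the dash as well as the underscore, so the two complete patterns quoted in this post do not seem to accept the same addresses. The sketch below compares them; the variable names and sample list are assumptions for illustration only.

```php
<?php
// The two complete patterns quoted in the post, with / delimiters added.
$explicitClass = '/^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$/';
$negatedClass  = '/^[\w.-]+@[^\W_]+\.[A-Za-z]{2,6}$/';

// Sample addresses used elsewhere in the thread.
$samples = [
    'somename@somewebsite.com',
    'somename@some-website.com',
    'somename@somewebsite.co.uk',
];

foreach ($samples as $email) {
    $first  = preg_match($explicitClass, $email) === 1 ? 'match' : 'no match';
    $second = preg_match($negatedClass, $email) === 1 ? 'match' : 'no match';
    echo $email, ': explicit class = ', $first, ', negated class = ', $second, PHP_EOL;
}
```

With the negated class, the part between @ and the final dot can hold only letters and digits, so hyphenated domains and multi-part suffixes such as .co.uk are rejected.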
  8. Aha! That's new and interesting, thank you Larry. Also, I added a second question to the previous post, could you please take a look and comment? I'd be grateful for your input. I'm asking my questions because I really want to know, not just to be an ^a(ss|rse)hole$.
  9. Not berating at all: I respect everyone on this forum, and personally you, Larry. I just thought my question was clear from the beginning, namely: So I was a little taken aback by what I perceived as all the guys ignoring or misinterpreting the actual question, and persistently offering a workaround instead. I should have blamed myself instead and rephrased the question right away, but I truly thought what I was asking about was clear from the start. I also honestly thought that you were joking, when you posted the huge chunk of code above. I thought you implied the absurdity
  10. Hm... I think I can see how this may work, but something seems missing here. Aren't you supposed to have * instead of + after [000-\031]? Seriously though, Larry, I appreciate your humorous (even if slightly extreme) input very much, but... let's think calmly. The chapter on regular expressions in your book is teaching how to use regular expressions, not the filter_var() function. My question is about regular expressions, not the filter_var() function. I think it's fair of me to treat this forum with respect as a trustworthy source of information, ask questions about things that interest me,
  11. Thanks, HartleySan! Again, I'm fully with you and Antonio and Edward, as far as using filter_var, and not having to be perfect in most practical situations. But purely theoretically – this is a forum, dedicated to honing one's programming skills, and we can allow ourselves to play a little, can't we? So, my question to you, HartleySan. The solution that you proposed: [A-Za-z0-9] - am I correct in thinking that it would reject the following perfectly valid email address as invalid? somename@some-website.com And if yes, as it seems to be, then what would be a better sol
  12. You have a point here, Antonio. Still: perfection, in regex, I guess, is when it does exactly what it is meant to. If regex is written to check for a valid email address, it's supposed to check for a valid email address, no more, no less. This one doesn't do the job. I want the one that does. Yes, a user can submit something that is not an actual email address. But I don't want that user to submit something that is not a valid email address. There's a difference. When things don't do what they are supposed to be doing, the next thing that usually happens, the entire civilization crumbles. We
  13. From a purely practical point of view, Antonio, you're quite right, but I'm in pursuit of perfection with this one. How can we say "any character in \w, except _"?
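Two common ways to express "any character in \w, except _" are a double negation inside a character class and a negative lookahead. The sketch below tries both on single characters; the variable names and the test characters are assumptions for illustration, and neither form is claimed to be the one the thread settled on.

```php
<?php
// Double negation: anything that is NOT a non-word character and NOT an underscore.
$doubleNegation = '/^[^\W_]$/';

// Negative lookahead: check that the next character is not "_", then match \w.
$lookahead = '/^(?!_)\w$/';

foreach (['a', 'Z', '7', '_', '-', '.'] as $char) {
    $first  = preg_match($doubleNegation, $char) === 1 ? 'match' : 'no match';
    $second = preg_match($lookahead, $char) === 1 ? 'match' : 'no match';
    printf("%s  [^\\W_]: %s  (?!_)\\w: %s\n", $char, $first, $second);
}
```

Both forms accept letters and digits and reject the underscore, the dash and the dot. The character-class version is usually easier to combine with other allowed characters.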
  14. Thank you, Edward (cool avatar!) This is, of course, a perfect solution – but I'm curious about how to improve this particular regex.
  15. I have a question about email matching with regular expressions. Chapter 14, Pg. 445, contains the following email matching pattern: ^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$ I may be wrong, but wouldn't it match something like this? somename@some_website.com
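The question can be answered by testing the quoted pattern directly. A minimal sketch follows; the variable names are assumptions for illustration.

```php
<?php
// Pattern as quoted from the book in the post, with / delimiters added.
$bookPattern = '/^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$/';

// Address from the question, with an underscore in the domain part.
$address = 'somename@some_website.com';

// \w covers letters, digits and the underscore, so [\w.-]+ after the @
// accepts "some_website".
$result = preg_match($bookPattern, $address);

echo $address, ' => ', $result === 1 ? 'match' : 'no match', PHP_EOL;
```

The address matches, which confirms the suspicion in the post.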
  16. Thank you very much, HartleySan. That's exactly what I wanted. If you don't mind, I may ask you a couple more questions about that later.
  17. Thank you, HartleySan - would it be too insolent of me to ask you to show me how this can be done?
  18. I have a question, inspired by the "Random Quotes" mini-application, used as an example of writing to a file and reading from a file, in Chapter 11 of this book. The "read from file" part of this application picks a random quote from an array, and outputs it to a web browser. I want to figure out the simplest way to improve this script, so that the random quotes that it outputs are not repeated, as long as there are quotes left in the file. For example, the script has output the array element # 0. I want it to remember that the element # 0 has been output, and the next time to o
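One simple approach to the idea in this post is to keep a list of the indexes already shown and pick only from the rest. The sketch below is an assumption-laden illustration, not the book's script: pickUnusedQuote() is a hypothetical helper, the quotes are hard-coded instead of read from a file, and in a real page the $used array would have to be stored between requests (for example in a session).

```php
<?php
// Hypothetical helper: returns a random quote whose index is not yet in $used,
// records that index in $used, and returns null once every quote has been shown.
function pickUnusedQuote(array $quotes, array &$used): ?string
{
    // Indexes that have not been output yet.
    $remaining = array_diff(array_keys($quotes), $used);

    if (count($remaining) === 0) {
        return null;
    }

    // array_rand() returns a random key of $remaining; its value is a quote index.
    $index = $remaining[array_rand($remaining)];
    $used[] = $index;

    return $quotes[$index];
}

// Hard-coded sample data (assumption); the book's script reads quotes from a file.
$quotes = ['First quote', 'Second quote', 'Third quote'];
$used = [];

// Ask for one more quote than exists to show the "nothing left" case.
for ($i = 0; $i < 4; $i++) {
    $quote = pickUnusedQuote($quotes, $used);
    echo $quote ?? '(no quotes left)', PHP_EOL;
}
```

Shuffling the array once with shuffle() and walking through it in order would give the same no-repeat behaviour with less bookkeeping.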
  19. Chapter 11, pg. 301, contains the following text: If you are running PHP 5.1 or greater, you can add the LOCK_EX constant as the third argument to file_put_contents(): file_put_contents($file, $data, LOCK_EX); To use both the LOCK_EX and FILE_APPEND constants, separate them with the binary OR operator (|): file_put_contents($file, $data, FILE_APPEND | LOCK_EX); It doesn’t matter in which order you list the two constants. This is a really interesting solution, and I'm surprised my mind didn't register it when I was reading that book for the first time. Could s
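The reason the order does not matter is that each flag constant is an integer with a different bit set, and the | operator simply merges those bits into one value. A minimal sketch, assuming a throwaway temporary file created with tempnam() (the file name prefix and the sample lines are made up for illustration):

```php
<?php
// Combine the two flags; each constant contributes its own bit.
$flags = FILE_APPEND | LOCK_EX;

// The combined value is the same whichever constant comes first.
var_dump($flags === (LOCK_EX | FILE_APPEND));

// Each individual flag can still be detected with the bitwise AND operator.
var_dump(($flags & FILE_APPEND) === FILE_APPEND);
var_dump(($flags & LOCK_EX) === LOCK_EX);

// Temporary file used only for this demonstration (assumption).
$file = tempnam(sys_get_temp_dir(), 'quotes_');

// Both calls append under an exclusive lock, regardless of flag order.
file_put_contents($file, "first line\n", FILE_APPEND | LOCK_EX);
file_put_contents($file, "second line\n", LOCK_EX | FILE_APPEND);

$contents = file_get_contents($file);
echo $contents;

unlink($file);
```

The same idea applies to other functions that take a flags argument built from constants.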
  20. Chapter 11, pg. 293 includes the following text: "When appending data to a file, you normally want each piece of data to be written on its own line, so each submission should conclude with the appropriate line break for the operating system of the computer running PHP. This would be ■ \n on Unix and Mac OS X ■ \r\n on Windows" I'm pretty sure this is the first time the distinction between \n for Unix/Mac and \r\n for Windows is mentioned in this book. Does this mean that \n wouldn't work on a Windows-based server? Could someone please elaborate on this?
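PHP provides the PHP_EOL constant, which holds the line break of the platform PHP is running on, so the script does not need to hard-code either form. When reading, trimming both \r and \n copes with files written on either system. The sketch below is an illustration under those assumptions; the sample strings are made up.

```php
<?php
// PHP_EOL is "\n" on Unix-like systems and "\r\n" on Windows.
$line = 'A submitted entry' . PHP_EOL;
echo $line;

// Lines as they might come out of files written on different systems (sample data).
$rawLines = [
    "written on Unix\n",
    "written on Windows\r\n",
];

// rtrim() with "\r\n" as the character list strips either style of line break.
$cleanLines = [];
foreach ($rawLines as $raw) {
    $cleanLines[] = rtrim($raw, "\r\n");
}

print_r($cleanLines);
```

Writing a bare \n on a Windows server still puts a line feed in the file, and PHP reads it back without trouble. The difference mainly shows up when the file is opened in tools that expect \r\n.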
  21. P.S. As a corollary to that - if I want to include a non-php file into another file, and want to prevent its content from being viewable in a browser, perhaps I can save that file with .php extension?