Most people validate e-mail addresses in a normal programming language, such as PHP and that is a very good idea when you process user input and such, however sometimes you want to validate data already entered in a database.
Take this example: you have a comments table where each person can enter an e-mail address, but since it’s an optional field you don’t even validate the input. However when you need to send a newsletter you don’t want to process all fields and validate one by one, you want to use a single query and just get all e-mail addresses (eventually by adding additional rules as well), so here’s how you can do it in MySQL:
SELECT *
FROM `m_comment`
WHERE `comment_url`
REGEXP ‘(.*)@(.*)\.(.*)’
Sure, you can optimize the query, but it’s a simple regular expression which .. well, in most cases is good enough.
Not good enough, IMHO, because that Regexp will match “@.” too!
To me, the absolute minimum is, at least,
'(.+)@(.+)\.(.{2,4})'
. 🙂Yeah, your regexp expression is indeed better, the first one I wrote was the bare minimum, you can find a much optimized one of course, but on a real situation it was good enough (yours is better indeed).
It doesn’t work properly. It ignores the . , if there is a . in it or not. It always passes 🙁
Please help! 😀
=
Thank U. it worked for me.
A way better regular expression:
‘^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9._-]@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]\.[a-zA-Z]{2,4}$’
Regex with MySQL is slightly different, you need to “double protect” special cars, here’s the good one :
^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9._-]@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]\\.[a-zA-Z]{2,4}$
SELECT * FROM user WHERE email NOT REGEXP ‘^[a-zA-Z0-9][+a-zA-Z0-9._-]*@[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]*\\.[a-zA-Z]{2,4}$’
handles gmail address with +
handles email addresses where the host is a single letter
tested against approximately 8000 user records on a production database
=
When I have to do a select, it is too late, because I then have already invalid data in the database. Validation must happen before an insert, not after.