Ruby Email validation with regex




I am working through a large list of email addresses, many of which contain typos. I am trying to build a regex that checks whether an address is valid.



This is what I have for the regex so far:


def is_a_valid_email?(email)
  (email =~ /^(([A-Za-z0-9]*\.+*_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\++)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@{1}((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,4}$/i)
end



It passes if an email has underscores and only one period. I have a lot of emails that have more than one period in the name itself. How do I check for that in the regex?


hello.me_1@email.com # <~~ valid
foo.bar#gmail.co.uk # <~~~ not valid
f.o.o.b.a.r@gmail.com # <~~~valid
f...bar@gmail.com # <~~ not valid
get_at_m.e@gmail #<~~ valid



Can someone help me rewrite my regex?





Possible duplicate of stackoverflow.com/questions/201323/…
– CAustin
Apr 10 '14 at 16:43





Refer here for creating your RegEx.
– tenub
Apr 10 '14 at 16:58




9 Answers


VALID_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i



You seem to be complicating things a lot, I would simply use:


VALID_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z/i



which is taken from Michael Hartl's Rails book.



Since this doesn't meet your dot requirement, it can simply be amended like so:


VALID_EMAIL_REGEX = /\A([\w+\-]\.?)+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z/i
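As a quick sanity check, the amended pattern can be run against the addresses from the question. This is only a sketch: it assumes the dot after the character class is meant as a literal dot, and the names AMENDED_EMAIL_REGEX and SAMPLES are illustrative, not part of the answer.

```ruby
# Amended pattern: every local-part character may be followed by at most one dot.
# Assumption: the dot after the character class is a literal dot (\.?).
AMENDED_EMAIL_REGEX = /\A([\w+\-]\.?)+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z/i

# Addresses taken from the question.
SAMPLES = [
  'hello.me_1@email.com',
  'foo.bar#gmail.co.uk',
  'f.o.o.b.a.r@gmail.com',
  'f...bar@gmail.com',
  'get_at_m.e@gmail'
].freeze

SAMPLES.each do |email|
  verdict = AMENDED_EMAIL_REGEX.match?(email) ? 'valid' : 'not valid'
  puts "#{email} -> #{verdict}"
end
```

Note that get_at_m.e@gmail comes out as not valid here, because the pattern requires a dot followed by letters at the end of the domain.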



As mentioned by CAustin, there are many other solutions.



EDIT:



It was pointed out by @installero that the original fails for subdomains with hyphens in them. This version will work (not sure why the character class was missing digits and hyphens in the first place):


VALID_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i





That is so much simpler than what I had. Thank you.
– T0ny lombardi
Apr 10 '14 at 17:02





How can I add this validation for email_field? Currently, it only checks for the presence of @. I want it to verify the presence of . as well.
– sshah
Jan 19 '16 at 13:40







@sshah what do you mean by email_field? This regex checks that the email is something_valid@somewhere.tld (see the . parts in the second part of the regex).
– Mike H-R
Jan 21 '16 at 18:28









@MikeH-R hmmm, that regex (Michael Hartl's) returns valid for only @. Is that a valid email?
– Mohamad
Jan 29 '16 at 0:34







@Mohamad the regex really shouldn't be matching just @ by itself (though it could be argued, as John Carney does below, that that would make a more accurate match for an email). All of the groups with + require one or more matches. E.g. [\w+\-.]+ at the start will match a or aaaa or a+b. but not the empty string. See here for a demonstration.
– Mike H-R
Jan 29 '16 at 0:51







Here's a great article by David Celis explaining why every single regular expression you can find for validating email addresses is wrong, including the ones above posted by Mike.



From the article:



The local string (the part of the email address that comes before the
@) can contain the following characters:


`! $ & * - = ` ^ | ~ # % ' + / ? _ { }`



But guess what? You can use
pretty much any character you want if you escape it by surrounding it
in quotes. For example, "Look at all these spaces!"@example.com is a
valid email address. Nice.



If you need to do a basic check, the best regular expression is simply /@/.
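A minimal sketch of that basic check follows. The helper name plausible_email? is hypothetical, and the check only confirms that an at sign is present; it says nothing about whether the address can receive mail.

```ruby
# Basic check only: the string must contain an at sign.
# plausible_email? is a hypothetical helper name, not part of any library.
def plausible_email?(email)
  /@/.match?(email.to_s)
end

puts plausible_email?('"Look at all these spaces!"@example.com')
puts plausible_email?('foo.bar#gmail.co.uk')
```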







why the downvotes?
– Scott Stafford
Apr 28 '15 at 19:23





At a guess... even though virtually anything can be in an email provided it's quoted properly, in reality 99.99% of emails follow a reasonably standard format, and many systems will barf when handed an address they don't recognise as valid (even if it is). If you've got such a component, then making sure the email address is reasonable as well as valid is important, particularly if it's part of a legacy system or something that can't be changed or updated.
– Dave Smylie
Sep 28 '16 at 23:06



This one is shorter and safer:


/\A[^@\s]+@[^@\s]+\z/



This regex is used in the Devise gem.
However, it is too permissive and accepts malformed values such as these:


".....@a....",
"david.gilbertson@SOME+THING-ODD!!.com",
"a.b@example,com",
"a.b@example,co.de"



I prefer to use the regexp from the Ruby standard library, URI::MailTo::EMAIL_REGEXP.
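To make the difference concrete, here is a small sketch that runs the values listed above through both patterns. The constant names DEVISE_STYLE_REGEX and ODD_VALUES are illustrative; the only library call assumed is URI::MailTo::EMAIL_REGEXP from the standard uri library.

```ruby
require 'uri'

# The short pattern quoted above: no whitespace or "@" on either side of a single "@".
DEVISE_STYLE_REGEX = /\A[^@\s]+@[^@\s]+\z/

# The problem values listed above.
ODD_VALUES = [
  '.....@a....',
  'david.gilbertson@SOME+THING-ODD!!.com',
  'a.b@example,com',
  'a.b@example,co.de'
].freeze

ODD_VALUES.each do |value|
  loose  = DEVISE_STYLE_REGEX.match?(value)
  strict = URI::MailTo::EMAIL_REGEXP.match?(value)
  puts "#{value}: short pattern=#{loose}, URI::MailTo=#{strict}"
end
```

Each value is accepted by the short pattern and rejected by the standard-library one.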





There is a gem for email validations



Email Validator





Thanks for pointing me to URI::MailTo::EMAIL_REGEXP! Feels like the best approach since that may be better maintained than dumping a custom regexp somewhere in a codebase.
– carp
May 24 '17 at 8:01





This has been built into the standard library since at least Ruby 2.2.1:


URI::MailTo::EMAIL_REGEXP





'aa@aaa' =~ URI::MailTo::EMAIL_REGEXP Doesn't work with this case.
– Benjamin
Jun 19 at 12:16









Nice catch, it seems that's intentional: rubydoc.info/stdlib/uri/URI/MailTo html.spec.whatwg.org/multipage/input.html#valid-e-mail-address
– Joshua Hunter
Jun 20 at 1:01





Nowadays Ruby provides an email validation regexp in its standard library. You can find it in the URI::MailTo module; it's URI::MailTo::EMAIL_REGEXP.
In Ruby 2.4.1 it evaluates to:




/\A[a-zA-Z0-9.!\#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*\z/



But I'd just use the constant itself.
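For a large list like the one in the question, using the constant could look roughly like this. It is a sketch under assumptions: split_by_validity is a hypothetical helper name and the sample list is made up for illustration.

```ruby
require 'uri'

# split_by_validity is a hypothetical helper: it partitions a list into
# addresses that URI::MailTo::EMAIL_REGEXP accepts and those it rejects.
def split_by_validity(emails)
  emails.partition { |email| URI::MailTo::EMAIL_REGEXP.match?(email) }
end

# Illustrative sample; in practice this would be the full list of addresses.
accepted, rejected = split_by_validity([
  'hello.me_1@email.com',
  'foo.bar#gmail.co.uk',
  'f.o.o.b.a.r@gmail.com',
  'aa@aaa'
])

puts "accepted: #{accepted.inspect}"
puts "rejected: #{rejected.inspect}"
```

An address such as aa@aaa ends up in the accepted list, since the pattern does not require a dot in the domain.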





This was written 3 years ago and, if I recall, I was still using Ruby 1.9. Possibly that's the reason why I didn't know about it? Thank you for your one-liner though.
– T0ny lombardi
Oct 16 '17 at 12:27





Yeah, but three years later people still answer with their custom regular expressions. Anyway, I did not intend to attack you nor anyone else. I've changed the tone of my answer accordingly.
– kaikuchn
Jan 31 at 16:35



I guess the example from the book can be improved to match emails with - in the subdomain.




VALID_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i



For example:


> 'some@email.with-subdomain.com' =~ VALID_EMAIL_REGEX
=> 0





Ahhh, I didn't see your answer until now, this was what I added to my answer.
– Mike H-R
Aug 15 '16 at 11:02



Yours is complicated indeed.


VALID_EMAIL_REGEX = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i



The above code should suffice.



Explanation of each piece of the expression above for clarification:



Start of regex:


/



Match the start of a string:


\A



At least one word character, plus, hyphen, or dot:


[\w+\-.]+



A literal "at sign":


@



At least one letter, digit, hyphen, or dot (the domain name):


[a-z\d\-.]+



A literal dot:


\.



At least one letter:


[a-z]+



Match the end of a string:


\z



End of regex:


/



Case insensitive:


i



Putting it back together again:


/\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i
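The same breakdown can live inside the pattern itself. The sketch below rewrites the expression in Ruby's extended mode (the x flag), where whitespace and comments inside the literal are ignored; the constant name COMMENTED_EMAIL_REGEX is illustrative.

```ruby
# Same pattern written in extended mode, so each piece carries its own comment.
COMMENTED_EMAIL_REGEX = /
  \A             # start of the string
  [\w+\-.]+      # local part: word characters, plus, hyphen, or dot
  @              # literal at sign
  [a-z\d\-.]+    # domain: letters, digits, hyphens, or dots
  \.             # literal dot before the final label
  [a-z]+         # final label: letters only
  \z             # end of the string
/xi

puts COMMENTED_EMAIL_REGEX.match?('hello.me_1@email.com')
puts COMMENTED_EMAIL_REGEX.match?('foo.bar#gmail.co.uk')
```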



Check out Rubular to conveniently test your expressions as you write them.



Try this:



/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i





It picks the email address out of surrounding text, for example:


"Robert Donhan" <bob@email.com>sadfadf
Robert Donhan <bob@email.com>
"Robert Donhan" abc.bob@email.comasdfadf
Robert Donhan bob@email.comadfd
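Here is a short sketch of how such an unanchored pattern pulls an address out of a longer string. It assumes the final character class is meant to be [A-Z], and extract_email is a hypothetical helper name. Because nothing anchors the end of the match, letters glued directly onto the address can be absorbed into it, as the second call shows.

```ruby
# Unanchored pattern: finds an address anywhere inside a longer string.
# Assumption: the last character class is [A-Z] (letters), 2 to 4 of them.
EMBEDDED_EMAIL_REGEX = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i

# extract_email is a hypothetical helper; it returns the first match or nil.
def extract_email(text)
  text[EMBEDDED_EMAIL_REGEX]
end

puts extract_email('"Robert Donhan" <bob@email.com>sadfadf')  # prints bob@email.com
puts extract_email('Robert Donhan bob@email.comadfd')         # prints bob@email.coma
```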



This works well for me:


# String#match? converts the string pattern to a Regexp (Ruby 2.4+).
if email.match?('[a-z0-9]+[_a-z0-9.-]*[a-z0-9]+@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})')
  puts 'matches!'
else
  puts "it doesn't match!"
end





