
§ Validating email addresses (it's impossible) (Rant Alert!)

Search Google for "validate email address" and you'll get about 303,000 results.  I bet all of them are wrong...

Short version:

Why bother?  I'm probably going to enter "example@example.com" anyway.

I wanted to search regexlib.com to see how many email validation regexes they had, but it's broken right now.  I have been there before and there are... lots and lots of them.  (I just tried again and it gave me 108.)

I once took over a project that only allowed email addresses ending with ".com", ".org", and ".net".  Wait a second – where did the internet come from?  What about ".edu", ".mil", or ".gov"?  ".info"?  ".museum"?  ".uk" or ".de"?  To top it off, the project had been started by someone outside the US!  (Here's the definitive list of top-level domains from the IANA site.)

Ok, the domain name part is defined well enough.  Err... try this: va.  (Even better, just type "va." in your address bar.)  No, not the Veterans Administration; it's the Vatican.  So is "pope@va" a valid email address?  I dunno.  The ones I spotted were something like "name@sub.va" so maybe not.

If you look at the RFCs at the IETF site (RFC 822, obsoleted by RFC 2822, which was itself obsoleted by RFC 5322, but don't go there without some aspirin), you may be surprised at what can go in front of the @.  You could look at Wikipedia, but I like this one best:  I Knew How To Validate An Email Address Until I Read The RFC from Phil Haack.

Another link on Mr. Haack's post goes to RFC3696.  This one might be worth looking at.  Its title is Application Techniques for Checking and Transformation of Names, and it discusses various things about how internet names should work, do work in practice, what's defined, what's not defined...

Anyway, if you go there, you will see this table in section 4.3, "The MAILTO URL":

 

   +-------------------------+-----------------------------+-----------+
   |      Email address      |         MAILTO URL          |   Notes   |
   +-------------------------+-----------------------------+-----------+
   |     Joe@example.com     |  mailto:joe@example.com     |     1     |
   |                         |                             |           |
   |  user+mailbox@example   |         mailto:             |     2     |
   |          .com           |  user%2Bmailbox@example     |           |
   |                         |          .com               |           |
   |                         |                             |           |
   |  customer/department=   |  mailto:customer%2F         |     3     |
   |  shipping@example.com   | department=shipping@example |           |
   |                         |          .com               |           |
   |                         |                             |           |
   |   $A12345@example.com   |  mailto:$A12345@example     |     4     |
   |                         |          .com               |           |
   |                         |                             |           |
   |  !def!xyz%abc@example   |  mailto:!def!xyz%25abc      |     5     |
   |          .com           |       @example.com          |           |
   |                         |                             |           |
   |  _somename@example.com  |  mailto:_somename@example   |     4     |
   |                         |          .com               |           |
   +-------------------------+-----------------------------+-----------+

 

These are all valid email addresses!  Go there and read the notes to see why.
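
If you're wondering where the right-hand column comes from, here is a minimal Python sketch, assuming the rule is simply "percent-encode the local part except for a few punctuation characters".  The name `to_mailto` and the `SAFE_IN_LOCAL_PART` set are mine, picked so the output matches the +, / and % rows of the table; they are not lifted from the RFC's grammar.  It leaves upper and lower case alone, so it won't lowercase "Joe" the way the first row does.

```python
from urllib.parse import quote

# Assumption: punctuation the table leaves unescaped in the local part.
# quote() already leaves letters, digits and "_.-~" alone; anything
# else not listed here is percent-encoded.
SAFE_IN_LOCAL_PART = "!$="

def to_mailto(address: str) -> str:
    """Build a mailto: URL by percent-encoding the local part (sketch)."""
    # Split on the LAST "@", since a quoted local part may contain "@".
    local, at, domain = address.rpartition("@")
    if not at:
        raise ValueError("no @ in address")
    return "mailto:" + quote(local, safe=SAFE_IN_LOCAL_PART) + "@" + domain

for addr in ("user+mailbox@example.com",
             "customer/department=shipping@example.com",
             "!def!xyz%abc@example.com"):
    print(to_mailto(addr))
# mailto:user%2Bmailbox@example.com
# mailto:customer%2Fdepartment=shipping@example.com
# mailto:!def!xyz%25abc@example.com
```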

I think you can also do:

"Anything You Want In Double Quotes `~!@#$%^&*()_+-\][{}';/.,<>?:"@[192.168.123.45]

I believe these are also valid:

first.last@[IPv6:1111:2222:3333:4444:5555:6666:12.34.56.78]
first.last@x23456789012345678901234567890123456789012345678901234567890123.example.com

And they get even worse; these are test cases from a validator project by Dominic Sayers.
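
Two of the rules those examples lean on are at least easy to check mechanically: a single DNS label can be at most 63 characters, and a bracketed domain has to hold a real IP address (with an "IPv6:" tag in front for the IPv6 flavor).  Here is a rough Python sketch of just those two checks.  The function name `domain_part_plausible` is made up for illustration, and it ignores plenty of other rules (total domain length, which characters a label may contain, and so on), so treat it as a toy.

```python
import ipaddress

MAX_LABEL_LEN = 63  # DNS limit for one dot-separated label

def domain_part_plausible(domain: str) -> bool:
    """Rough check of whatever follows the '@' (sketch, not a validator)."""
    if domain.startswith("[") and domain.endswith("]"):
        literal = domain[1:-1]
        try:
            if literal.startswith("IPv6:"):
                # The standard library accepts a dotted IPv4 tail here.
                ipaddress.IPv6Address(literal[len("IPv6:"):])
            else:
                ipaddress.IPv4Address(literal)
        except ValueError:
            return False
        return True
    # Ordinary domain name: every label non-empty and not too long.
    return all(0 < len(label) <= MAX_LABEL_LEN for label in domain.split("."))
```

Note that it happily accepts a bare "va", which is exactly the kind of thing most regexes throw out.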

I suppose you could be ultra cool and try to contact a mail server at whatever follows the @.

 

So, what do I do?  Well, I've been validating with this regex:

^.+@.+\..+$

meaning something@something.something

This can help catch a fumble-fingered typo, and at least it has to look something like a real email address.  While it allows invalid email addresses, it should allow anything that is valid...  unless "pope@va" is valid.
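
For the curious, here is that regex wrapped in a few lines of Python, as a sketch of how I'd use it rather than a recommendation.  The function name `looks_like_email` and the sample strings are mine; the pattern is the one above, unchanged.

```python
import re

# The pattern from above, unchanged: something@something.something
LOOSE_EMAIL = re.compile(r"^.+@.+\..+$")

def looks_like_email(text: str) -> bool:
    """True if text has the rough shape x@y.z; says nothing about validity."""
    return LOOSE_EMAIL.match(text) is not None

samples = [
    "!def!xyz%abc@example.com",  # valid per the RFC 3696 table
    "two@@example.com",          # not valid, but has the right shape
    "pope@va",                   # no dot after the @
    "no-at-sign.example.com",    # typo: forgot the @
]
for s in samples:
    print(s, looks_like_email(s))
# !def!xyz%abc@example.com True
# two@@example.com True
# pope@va False
# no-at-sign.example.com False
```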

 

I just think it's far too complicated to bother with.  Remember the KISS principle?

Besides, if you force me to enter an email address, you're probably gonna get "example@example.com" anyway.

 

last edited on April 10th, 2010 at 3:32 AM
