A recent bug in my Java SMTP client led me down the fun path of figuring out how to conditionally split a comma-separated string of email addresses in Java.
Since Sun discourages StringTokenizer in new code (it is kept as a legacy class, and the Javadoc recommends String.split or java.util.regex instead), the following regex-based split on a comma-separated string normally does the trick:
Java:
recipientsArr = recipientsStr.split("\\,");
However, what if you only want to split on commas that are not inside single or double quotes? Hrmm... tricky.
For example, I want to split the following string to create 2 valid email addresses, not 3 invalid ones:
"Leon, J" <j@email.com>, "M" <j@email.com>
We'll need some fancy Regular Expression goodness. Trouble is, I am not that great with RegEx grammar. Luckily, Neal Ford is very good at it. In his tutorial about "Power" regexes, he has a great example of how to match a comma only if it is not inside quotes:
RegEx:
,(?=([^']*'[^']*')*(?![^']*'))
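To see why this works, it may help to read the expression piece by piece: a comma matches only when the rest of the string contains an even number of quotes, which means the comma itself cannot sit between an opening and a closing quote. The sketch below annotates the same expression using Java's Pattern.COMMENTS flag. The class name, method name, and sample string are illustrative assumptions of mine, not part of the tutorial.

```java
import java.util.regex.Pattern;

class CommaOutsideQuotes {
    // Same expression as above, spread over several lines. With Pattern.COMMENTS,
    // whitespace in the pattern is ignored and '#' starts a comment to end of line.
    static final Pattern COMMA_OUTSIDE_SINGLE_QUOTES = Pattern.compile(
            ",                     # a literal comma\n"
          + "(?=                   # ...but only if what follows is:\n"
          + "  ([^']*'[^']*')*     #   zero or more complete pairs of quotes\n"
          + "  (?![^']*')          #   and after those, no further quote\n"
          + ")",
            Pattern.COMMENTS);

    // Hypothetical helper: split on commas that are not inside single quotes.
    static String[] split(String input) {
        return COMMA_OUTSIDE_SINGLE_QUOTES.split(input);
    }

    public static void main(String[] args) {
        for (String piece : split("'Leon, J' <j@email.com>, 'M' <j@email.com>")) {
            System.out.println(piece.trim());
        }
    }
}
```

The comma inside 'Leon, J' is followed by three quotes (an odd number), so the lookahead fails there; the comma between the two addresses is followed by two, so it matches.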
So, all that's left is to change it to look for double quotes, and to escape those quotes with backslashes so they survive inside a Java string literal. Final Solution:
recipientsArr = recipientsStr.split( ",(?=([^\"]*\"[^\"]*\")*(?![^\"]*\"))" );
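As a quick check, here is a minimal sketch that runs both the plain comma split and the quote-aware split against the example string from earlier. The class and method names are made up for illustration, and trimming the pieces is my own addition, since the split leaves the space after the separating comma in place.

```java
import java.util.ArrayList;
import java.util.List;

class RecipientSplitter {
    // The final expression from this post: a comma followed by an even
    // number of double quotes in the rest of the string.
    private static final String COMMA_OUTSIDE_QUOTES =
            ",(?=([^\"]*\"[^\"]*\")*(?![^\"]*\"))";

    // Hypothetical helper: split, trim each piece, and drop empty pieces.
    static List<String> splitRecipients(String recipientsStr) {
        List<String> recipients = new ArrayList<>();
        for (String piece : recipientsStr.split(COMMA_OUTSIDE_QUOTES)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                recipients.add(trimmed);
            }
        }
        return recipients;
    }

    public static void main(String[] args) {
        String recipientsStr = "\"Leon, J\" <j@email.com>, \"M\" <j@email.com>";

        // Plain split: breaks inside the quoted display name, giving 3 pieces.
        System.out.println(recipientsStr.split(",").length);

        // Quote-aware split: 2 pieces, one per address.
        for (String recipient : splitRecipients(recipientsStr)) {
            System.out.println(recipient);
        }
    }
}
```

Note that this relies on the quotes being balanced: a stray or backslash-escaped double quote inside a display name would change the quote count and throw the split off.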
Thursday, December 13, 2007
Splitting Hairs (Or Comma Separated Values) in Java ...
2 comments:
wow this was giving me a splitting headache and this post was exactly what I needed, thanks!
Glad it helped, Sanj!