Last active July 6, 2021 19:54
C# code to extract Email from text
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

namespace Coderbuddy
{
    public class ExtractEmail
    {
        public List<string> ExtractEmails(string textToScrape)
        {
            Regex reg = new Regex(@"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}", RegexOptions.IgnoreCase);
            Match match;
            List<string> results = new List<string>();
            // Walk every match in the text, in order of appearance.
            for (match = reg.Match(textToScrape); match.Success; match = match.NextMatch())
            {
                // Keep only the first occurrence of each address.
                if (!(results.Contains(match.Value)))
                    results.Add(match.Value);
            }
            return results;
        }
    }
}
@mbonafede Replace line 12 in the gist (the `Regex reg = ...` line) with the following code and it will work for you:
Regex reg = new Regex(@"[a-zA-Z0-9._%+-]+@[a-zA-Z]+(\.[a-zA-Z0-9]+)+", RegexOptions.IgnoreCase);
Please note that this code is a bit dated. ICANN now allows many more TLDs, including Unicode-based ones, so it may not work for every domain out there.
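For anyone who needs the newer TLDs, here is a minimal sketch of one way to loosen the pattern. It is an assumption on my part, not something from the gist: the class name `EmailPatternSketch`, the method `Extract`, and the pattern itself are all hypothetical. It swaps the ASCII ranges for the Unicode categories `\p{L}` (any letter) and `\p{N}` (any digit) and drops the upper limit on TLD length. It is still only a rough filter and does not validate addresses.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmailPatternSketch
{
    // Hypothetical pattern (not from the gist): Unicode letters and digits are
    // allowed in the local part and in every domain label, and the final label
    // must be at least two letters long with no upper limit.
    private static readonly Regex UnicodeAware = new Regex(
        @"[\p{L}\p{N}._%+-]+@[\p{L}\p{N}-]+(\.[\p{L}\p{N}-]+)*\.\p{L}{2,}");

    // Returns each distinct match in order of first appearance.
    public static List<string> Extract(string text)
    {
        var seen = new HashSet<string>();
        var results = new List<string>();
        foreach (Match m in UnicodeAware.Matches(text))
        {
            // HashSet.Add returns false when the value was already present.
            if (seen.Add(m.Value))
                results.Add(m.Value);
        }
        return results;
    }
}
```

Calling `EmailPatternSketch.Extract(text)` works the same way as `ExtractEmails` in the gist. The `HashSet` only replaces the repeated `List.Contains` scan; the deduplication is still case-sensitive.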
If you use an email like name.lastname@domain.com.ar it does not work.