Does this work for international email formats (umlauts, accented characters)?

It detects standard ASCII email addresses (the vast majority). Internationalized addresses with non-ASCII characters in the local part or domain are not extracted by default. They're rare in practice.

Can it find emails wrapped in HTML or markdown?

Yes. The pattern doesn't care about surrounding markup. It scans raw text for the email pattern wherever it appears.

What about obfuscated emails like 'name [at] domain [dot] com'?

Toggle on "Detect obfuscated emails" and we'll also catch [at], (at), [dot], (dot) patterns and reconstruct the address.

Is my pasted text uploaded anywhere?

No. The regex runs in your browser. Nothing is sent to any server.


Email Extractor — Pull Emails from Any Text

Paste any text and get a clean, deduplicated list of every email address.


About the Email Extractor

Pull every email address out of any text, whether pasted from documents, scraped from web pages, or exported from messaging apps. The tool uses regex matching that handles edge cases naive matchers miss. Output one address per line, comma-separated, or semicolon-separated, optionally deduplicated. Everything runs locally in your browser; nothing is uploaded.

What email extraction is

The job is straightforward to describe: scan input text, find substrings that look like email addresses, output the list. The complication is the usual one in regex work: what counts as an email address depends on which spec you follow, and the strict spec (RFC 5322) is so permissive that practical regex matching uses a more conservative rule.

The strict definition allows things almost no real email server accepts: quoted local parts, parenthesized comments, IP-address-based domains, dots inside quoted local parts. The pragmatic definition this tool uses matches what real email systems actually deliver to: local@domain.tld where the local part is letters, digits, dots, hyphens, plus signs, and underscores; the domain is letters, digits, hyphens, and dots; and the TLD is two or more letters.

This catches nearly every email address you'll encounter in practice, while rejecting fragments like example.@ or @example.com, which lack a domain or a local part and are not valid addresses under any spec.
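The pragmatic rule above can be sketched as a single regular expression. This is a minimal illustration assembled from the character classes just described; the tool's actual pattern isn't shown here, so `EMAIL_RE` and `extract_emails` are assumed names, not the tool's code.

```python
import re

# Assumed pattern: local part of letters, digits, dots, hyphens, plus signs,
# underscores; domain of letters, digits, hyphens, dots; TLD of 2+ letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_emails(text):
    """Return every match in input order, duplicates included."""
    return EMAIL_RE.findall(text)

sample = ("Ask first.last@example.com or <name+tag@mail.example.co.uk>; "
          "ignore @example.com and example.@")
print(extract_emails(sample))
# ['first.last@example.com', 'name+tag@mail.example.co.uk']
```

The angle brackets fall outside the character classes, so they are left behind, and the two fragments never match because each is missing one side of the @.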

Real use cases

Extracting contacts from a meeting notes document. Notes from a sales call mention five attendees and three follow-ups by name and email. Pasting the notes through the extractor pulls all eight addresses cleanly, skipping mentions of names without emails.

Pulling addresses from a forwarded email. An email thread with To, From, CC, BCC headers and inline mentions throughout the body. Extract all addresses for compliance review or for adding to a CRM.

Building a mailing list from a newsletter signup form export. Some lightweight signup forms (Notion forms, Google Forms with email fields) export submissions as one combined block of text per submission. Extract just the emails into a list, deduplicate with Remove Duplicates.

Auditing a large document for inappropriate email exposure. A privacy review needs to find every email address embedded in a public document, contract, or web archive. Extract the list, review for sensitivity, redact what shouldn't be public.

Parsing CSV exports from CRMs and tools. When a CSV has email addresses scattered across multiple columns (customer email, contact email, CC email), copy the whole CSV and extract. Faster than identifying which columns to combine.

Scraping legitimately public contact information. Conference attendee pages, public team directories, and similar public contact lists. Extract once, use for outreach. (Don't use this for spam scraping; respect privacy and unsubscribe requests.)

Cleaning up auto-converted emails from chat or messaging apps. When pasted from Slack or Discord, email addresses sometimes have extra characters attached (markdown link syntax, surrounding angle brackets, escape sequences). The extractor pulls just the email substring, leaving the noise behind.

Confirming an email exists in a long communication thread. Search-then-extract is faster than visual scanning for "did this person email us at any point?"

Output format options

Three output formats cover almost every downstream use.

One per line. The default. Easy to read, easy to paste into mailing list tools, easy to manipulate further with line-based tools (Sort, Remove Duplicates).

Comma-separated. Inline format suitable for pasting directly into the To: field of an email client. Most email clients accept comma-separated address lists.

Semicolon-separated. Outlook and some enterprise email systems prefer semicolons over commas. Use this format if your tool requires it.

Each format also offers a "unique only" toggle that deduplicates the result before output. Default is to preserve all matches in input order, which catches duplicate occurrences that may signal something interesting (the same address mentioned multiple times in a long thread).
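The three formats and the unique-only toggle amount to choosing a separator and optionally dropping repeats. A small sketch under that reading; the function name, the `mode` keys, and the exact separators (for example a space after each comma) are assumptions rather than the tool's implementation.

```python
def format_emails(emails, mode="lines", unique_only=False):
    """Join addresses one per line, comma-separated, or semicolon-separated."""
    separators = {"lines": "\n", "comma": ", ", "semicolon": "; "}
    if mode not in separators:
        raise ValueError(f"unknown mode: {mode!r}")
    if unique_only:
        # dict.fromkeys keeps the first occurrence and preserves input order
        emails = list(dict.fromkeys(emails))
    return separators[mode].join(emails)

found = ["a@example.com", "b@example.org", "a@example.com"]
print(format_emails(found, mode="semicolon", unique_only=True))
# a@example.com; b@example.org
```

With `unique_only` left off, all three entries are kept in input order, which matches the default behavior described above.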

Email validation patterns and edge cases

The pattern this tool uses catches the practical population of real email addresses while excluding common false positives.

Plus addressing like name+tag@example.com is supported. Plus addresses are valid and increasingly common. Gmail, Outlook, and FastMail all support them.

Dotted local parts like first.last@example.com are matched. Both single and multiple dots are allowed in the local part.

Subdomains in domains like name@mail.example.co.uk work. Multi-level domains and country-code TLDs are supported.

Hyphens in domains like name@my-company.com work. Hyphens are valid in domain names per the DNS spec.

What's not matched (intentionally): quoted local parts ("john doe"@example.com), IP-address domains (name@[192.168.1.1]), single-letter TLDs, and other edge cases that almost never appear in practice. These are valid per RFC 5322 but rejected by most real email infrastructure.

Punctuation at boundaries is excluded from matches. An address followed by a comma or period in prose ("contact name@example.com,") doesn't include the trailing punctuation in the extracted email. The address is just name@example.com.
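To see how these cases behave, the following sketch runs a pattern built from the rules in this section over one input per case. The pattern is an assumption reconstructed from the description, so the tool's own results could differ on inputs not listed here.

```python
import re

# Assumed pattern, built from the rules described in this section.
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

cases = [
    "name+tag@example.com",             # plus addressing
    "name@mail.example.co.uk",          # subdomains, country-code TLD
    "name@my-company.com",              # hyphen in the domain
    "contact name@example.com, today",  # trailing comma stays out of the match
    '"john doe"@example.com',           # quoted local part: no match
    "name@[192.168.1.1]",               # IP-address domain: no match
]

for text in cases:
    print(f"{text!r} -> {EMAIL_RE.findall(text)}")
```

The first four inputs each yield one clean address; the last two yield an empty list, because the quote and the square bracket fall outside the allowed character classes.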

Common pitfalls

Pasted text with line wrapping. Sometimes copy-paste from PDFs introduces line breaks in the middle of email addresses (example@ company.com). The extractor doesn't see this as a single email. To fix, run the input through Remove Spaces (remove line breaks mode) first, then extract.
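A narrower alternative to stripping every line break is to remove only the whitespace that directly follows an @. The sketch below shows that idea; it is a swapped-in technique, not what Remove Spaces does, and `rejoin_after_at` is a hypothetical helper. It will not repair breaks that fall elsewhere in an address.

```python
import re

# Assumed extraction pattern, reconstructed from the rules described earlier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def rejoin_after_at(text):
    """Remove whitespace (including line breaks) that directly follows an '@'."""
    return re.sub(r"@\s+", "@", text)

wrapped = "Reach sales at example@\ncompany.com for a quote."
print(EMAIL_RE.findall(wrapped))                   # [] because the break splits the address
print(EMAIL_RE.findall(rejoin_after_at(wrapped)))  # ['example@company.com']
```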

Obfuscated addresses. Some websites display emails with deliberate obfuscation: name [at] example [dot] com or name (at) example (dot) com. By default the extractor doesn't try to deobfuscate; if it doesn't look like a real email, it's not extracted. Turn on "Detect obfuscated emails" to catch the [at]/(at) and [dot]/(dot) forms.
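For text where the bracketed forms are known to be present, a normalization pass before extraction can rewrite them into ordinary addresses. This is a rough sketch of that idea only; the regexes and the `deobfuscate` helper are assumptions and are not taken from the tool.

```python
import re

# Assumed extraction pattern, reconstructed from the rules described earlier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Match [at] / (at) and [dot] / (dot), with optional surrounding whitespace.
AT_RE = re.compile(r"\s*[\[(]\s*at\s*[\])]\s*", re.IGNORECASE)
DOT_RE = re.compile(r"\s*[\[(]\s*dot\s*[\])]\s*", re.IGNORECASE)

def deobfuscate(text):
    """Rewrite bracketed 'at' and 'dot' markers as '@' and '.'."""
    text = AT_RE.sub("@", text)
    return DOT_RE.sub(".", text)

sample = "Write to name [at] example [dot] com or info (at) example (dot) org."
print(EMAIL_RE.findall(deobfuscate(sample)))
# ['name@example.com', 'info@example.org']
```

Only the bracketed and parenthesized markers are rewritten, so plain words like "at" in running prose are left alone.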

Email-like strings that aren't emails. github@v1.2 looks like an email but isn't (the final label, 2, is not a valid TLD). The extractor's TLD requirement (2+ letters) filters most of these. x@y.cc would match, because short TLDs exist (.cc, .tv, .io), but a bare 2 doesn't qualify.

Unicode email addresses. Internationalized email addresses (café@münchen.de) are valid per spec but rare in practice. The current extractor doesn't match them; ASCII-only addresses are the supported set. If you specifically need Unicode email handling, use a dedicated tool.

Privacy and consent. Just because email addresses are in text you have access to doesn't mean you have permission to email them. Respect anti-spam and data-protection laws (CAN-SPAM in the US, GDPR in Europe, CASL in Canada). Extract for legitimate purposes only.

Email Extractor vs grep vs custom regex

This tool. Fastest for ad-hoc extraction in a browser, no syntax to remember, output formats already structured.

grep with regex. grep -oE '[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' file.txt. Best for large files, automation, and pipelines.

Custom regex in your own tool or script. Needed when you want to validate emails (not just extract them), enforce stricter rules, or integrate with other processing.

How the tool works

Paste text into the input box. The tool runs a regex pattern against the entire input, collecting every match. Output is then formatted according to your selected mode (line-separated, comma-separated, semicolon-separated) and optionally deduplicated. Match count is reported so you can sanity-check the result.

Performance scales linearly with input size. Multi-megabyte inputs extract in well under a second.

Workflow tips

Always dedupe extracted lists. Real text usually contains duplicates. The unique-only output option saves a step.

Lowercase emails before further processing. Domains are case-insensitive, and the local part is case-insensitive for all practical purposes (no real mail server rejects mail based on local-part case). Run the extracted list through Case Converter (lowercase) before deduplication if you want to catch Name@Example.COM and name@example.com as the same address.
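The lowercase-then-deduplicate step can be sketched in a couple of lines. The helper name is hypothetical, and the sketch assumes you are happy to treat the local part as case-insensitive, as discussed above.

```python
def dedupe_case_insensitive(emails):
    """Lowercase each address, then drop repeats while keeping first-seen order."""
    return list(dict.fromkeys(email.lower() for email in emails))

extracted = ["Name@Example.COM", "other@example.org", "name@example.com"]
print(dedupe_case_insensitive(extracted))
# ['name@example.com', 'other@example.org']
```

Deduplicating without lowercasing first would keep both spellings of the first address.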

Verify before sending. Extracted emails sometimes include obvious false matches (typos, domain references that look like emails). Eyeball the list before sending mass email; the cost of one bounce is small, but the reputation cost of bouncing many is significant.

Combine with Sort Lines for reviewability. Alphabetical order makes a long extracted email list easier to scan for outliers and duplicates.

Frequently asked questions

Will it find emails in any document?

If the email addresses are present as plain text, yes. Encrypted documents and image-only PDFs aren't matched, and obfuscated emails (e.g., "name [at] example [dot] com") aren't matched unless obfuscation detection is turned on.

Does it validate that the emails actually exist?

No. Extraction finds strings that look like email addresses syntactically. Whether the address actually exists or accepts mail is a different operation that requires DNS lookups and SMTP probes, which a browser-based tool can't perform.

What about plus addresses?

Supported. name+filter@example.com is correctly extracted as name+filter@example.com.

Will it handle Unicode emails?

Not currently. The pattern matches ASCII addresses only. Internationalized domain names and local parts (rare but spec-valid) require a different pattern.

Are duplicate emails removed?

Only if you select the unique-only output option. Default behavior preserves all matches in input order, which sometimes signals useful information (the same address appearing 50 times might be the email of the document's author rather than a customer).

Is using this tool to scrape emails ethical?

Depends on the source and your purpose. Extracting emails from documents you have permission to use is fine. Scraping emails from web pages without permission, then sending unsolicited mail, violates anti-spam laws in most jurisdictions and is widely considered unethical.
