Email Pattern Harvesting is the systematic collection and validation of an organization's email and username structure. Attackers exploit public sources including LinkedIn profiles, GitHub commits, leaked databases, marketing materials, helpdesk portals, open contact forms, password reset pages, and conference speaker lists to map naming conventions.
Once attackers understand your patterns, such as firstname.lastname or f.lastname, they can generate thousands of plausible identity candidates. This reconnaissance substantially increases the success rate of password spraying, phishing campaigns, MFA fatigue attacks, and credential-stuffing operations, making it a critical precursor to broader identity compromise.
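Defenders can measure how clearly their own public footprint reveals a naming convention. The following is a minimal sketch for such a self-audit, assuming you already hold a list of (first name, last name, address) records gathered from your own public sources. The function names, the sample records, and the "flastname" template are illustrative assumptions, not part of any standard tool.

```python
from collections import Counter


def candidate_local_parts(first: str, last: str) -> dict[str, str]:
    """Return the local part each assumed naming template would produce."""
    if not first or not last:
        raise ValueError("first and last name must be non-empty")
    first, last = first.lower(), last.lower()
    return {
        "firstname.lastname": f"{first}.{last}",
        "f.lastname": f"{first[0]}.{last}",
        "flastname": f"{first[0]}{last}",  # assumed extra template
    }


def classify(first: str, last: str, address: str) -> str:
    """Name the template that explains an observed address, or 'unknown'."""
    local = address.split("@", 1)[0].lower()
    for template, value in candidate_local_parts(first, last).items():
        if local == value:
            return template
    return "unknown"


def pattern_summary(records) -> Counter:
    """Count how many observed addresses each template explains."""
    return Counter(classify(f, l, a) for f, l, a in records)
```

If one template explains most of the publicly visible addresses, an outsider can infer the convention just as easily, which is a useful signal for prioritizing the hardening steps discussed in the rest of this section.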
Common Sources
Public websites
LinkedIn profiles
GitHub commits
Leaked databases
Marketing materials
Helpdesk portals
Password reset pages
Conference listings
Attacker Objectives
Identity Mapping
Map organization username and email formats to understand naming conventions across departments and user types
User Differentiation
Distinguish contractors from employees, identify executive accounts, and locate privileged user patterns
Bulk Generation
Generate comprehensive user principal name (UPN) lists for spraying attacks and identify stale or legacy naming formats
Service Discovery
Enumerate service account naming patterns and detect shared mailbox naming logic for lateral movement
Email pattern harvesting bridges Reconnaissance → Enumeration → Credential Acquisition, making it a foundational step in identity-based attacks.
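The User Differentiation and Service Discovery objectives above depend on account names that advertise their role. As a rough sketch, a defender could scan a directory export for such identifiers. The markers below (svc, adm, ext, and so on) are hypothetical conventions chosen for illustration and should be replaced with whatever your directory actually uses.

```python
import re

# Hypothetical role markers; these are assumptions, not a standard.
ROLE_MARKERS = {
    "service account": re.compile(r"^(svc|service)[-_.]"),
    "admin account": re.compile(r"^(adm|admin)[-_.]|[-_.](adm|admin)$"),
    "contractor": re.compile(r"[-_.](ext|contractor)$"),
}


def revealed_roles(identifier: str) -> list[str]:
    """List the roles an identifier's local part gives away by its naming."""
    local = identifier.split("@", 1)[0].lower()
    return [role for role, pattern in ROLE_MARKERS.items() if pattern.search(local)]


def audit(identifiers):
    """Map each role-revealing identifier to the roles it exposes."""
    findings = {}
    for identifier in identifiers:
        roles = revealed_roles(identifier)
        if roles:
            findings[identifier] = roles
    return findings
```

Identifiers that show up in the findings are the ones an attacker could single out from a harvested list without any further probing.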
Organizations inadvertently expose email patterns through common identity infrastructure misconfigurations. Understanding these weaknesses is essential for hardening your defensive posture against reconnaissance activities.
1. MC-001: Publicly Exposed User Identifiers
Password reset pages and login portals reveal email structure through error messages and autocomplete behavior, allowing attackers to validate patterns without authentication.
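One common mitigation for this kind of leak is to return the same response whether or not the submitted address exists. The sketch below illustrates the idea under simplifying assumptions: known_users is an in-memory set and outbox is a plain list standing in for a real mail queue. Both names, and the message text, are hypothetical.

```python
GENERIC_MESSAGE = (
    "If an account exists for that address, a reset link has been sent."
)


def request_password_reset(email: str, known_users: set, outbox: list) -> str:
    """Handle a reset request without revealing whether the account exists."""
    normalized = email.strip().lower()
    if normalized in known_users:
        # Stand-in for queuing a real reset email out of band.
        outbox.append(normalized)
    # The caller sees the same message in both cases.
    return GENERIC_MESSAGE
```

Because the visible response is identical for valid and invalid addresses, the page no longer confirms guesses about the naming pattern. In a real deployment the status code and response time would need to match as well.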
2. MC-075: Weak Network Segmentation
Open identity endpoints lack proper segmentation, leaking naming behavior via verbose error messages and timing differences in responses.
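Timing differences often arise when a login handler skips password hashing for unknown usernames. A minimal sketch of one way to narrow that gap, using only the Python standard library, is to hash against a dummy record when the user is not found. The users mapping, the iteration count, and the function names are assumptions made for illustration, and this approach reduces rather than eliminates timing signals.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


# Dummy record so unknown usernames cost the same hashing work.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("placeholder-not-a-real-password", _DUMMY_SALT)


def verify_login(username: str, password: str, users: dict) -> bool:
    """Check credentials while doing equal work for known and unknown users.

    `users` maps a lowercase username to a (salt, stored_hash) tuple.
    """
    record = users.get(username.lower())
    salt, stored = record if record else (_DUMMY_SALT, _DUMMY_HASH)
    candidate = hash_password(password, salt)
    matches = hmac.compare_digest(candidate, stored)
    # An unknown user can never succeed, even if the dummy hash matched.
    return matches and record is not None
```

Pairing this with a single generic error message for every failed login removes the verbose-error channel as well.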
3. MC-146: Inconsistent Identity Trust Boundaries
Federation endpoints expose UPN patterns indirectly through SAML responses, OAuth flows, and trust relationship metadata.