Why 'Check the URL' Is Broken Advice: Homograph Attacks in 2026
Homograph attacks have been known for two decades, yet they still fool even security-conscious users. The problem isn't awareness. It's that checking URLs visually exploits a fundamental mismatch between how computers see characters and how humans see them.
What if I told you that apple.com and аррӏе.com can look identical in a browser's address bar?
You would probably click the lock icon, check for HTTPS, and feel confident you're at the real Apple. You would be wrong.
Homograph attacks exploit visual deception at the character level. They have been documented since 2001, studied extensively in academic literature, and yet they continue to succeed against users who know what they are. The advice to "check the URL before logging in" sounds reasonable until you understand why that advice fails so consistently.
I have been thinking about this problem for years. After running several hundred penetration tests and watching how different users interact with phishing simulations, I have come to believe that homograph attacks represent a class of vulnerability where the technical and human factors are inseparable. You cannot fix one without understanding the other.
This post is a technical deep-dive into homograph attacks in 2026. I will explain how these attacks work, why browser defenses have not solved the problem, what cognitive science tells us about why even technical users fall victim, and what actually works for defense. If you manage domain security, build web applications, or simply want to understand why your security training is not as effective as you think, this post is for you.
The Homograph Attack Problem
What Makes These Attacks Work
Internationalized domain names (IDNs) allow non-ASCII characters in domain names. Rather than being limited to the letters A through Z, digits, and hyphens, domain names can now include characters from dozens of writing systems. Cyrillic, Greek, Arabic, Chinese, and many other scripts are valid in DNS labels under the IDNA standards (RFC 5890 through RFC 5894), which rely on the Punycode encoding defined in RFC 3492.
This capability was added to make the internet more accessible. Someone in Japan should be able to register a domain name in their native script. Someone in Russia should not have to type Latin characters to visit a Cyrillic website. These are reasonable goals.
The problem is that many characters across different scripts look identical or nearly identical to the human eye. The Latin letter "a" and the Cyrillic letter "а" are visually indistinguishable in most fonts, and the Greek letter alpha ("α") is close enough to pass at a glance. The same is true for dozens of other character pairs. When an attacker registers a domain name that uses these lookalike characters, they can create a website that appears to be apple.com but is actually аррӏе.com or аpple.com, depending on which confusable characters they choose.
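To make the mismatch concrete, here is a minimal Python sketch that lists the code points behind a label. The helper name `describe` is my own, and the spoofed label is written with escape sequences so nothing depends on how your editor renders it.

```python
import unicodedata


def describe(label: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per character in the label."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in label]


latin = "apple"
# Cyrillic а, р, р, ӏ, е written as escapes
spoof = "\u0430\u0440\u0440\u04cf\u0435"

print(latin == spoof)  # False: the strings share no code points
for entry in describe(spoof):
    print(entry)
```

The two strings render alike, but a byte-level comparison never confuses them. That gap between what is compared and what is seen is the whole attack.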
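The term "homograph" is borrowed from linguistics, where it describes words that are spelled alike. The individual lookalike characters are called homoglyphs, a typographic term for characters that share the same visual representation. An attacker exploits this visual equivalence to deceive users into believing they are visiting a legitimate domain.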
The deception takes more than one form. A mixed-script attack swaps a few characters of a Latin name for lookalikes from another script, for example a domain that looks like google.com but contains a Cyrillic "о". A whole-script attack builds the entire label from a single non-Latin script, so an all-Cyrillic domain name can be crafted to look like a Latin brand name without tripping any rule against mixing scripts.
Why Two Decades of Awareness Has Not Solved This
You might wonder why this problem persists if it has been known for so long. The answer lies in the nature of the attack.
Defenses against homograph attacks have focused primarily on two areas. The first is registry-level restrictions. ICANN and individual domain registrars now require registrants to select a script or language when registering a domain. The goal is to limit which characters can appear together in a single domain name. The second is browser-level detection. Chrome, Firefox, and Safari all implement heuristics that try to identify potentially deceptive domain names and display them in Punycode format instead of showing the Unicode characters.
These defenses have made attacks more difficult. They have not made them impossible.
Research published in 2021 by Hang Hu and colleagues at Virginia Tech and the University of Illinois at Urbana-Champaign found that browser-level defenses were inconsistent across vendors and that even blocked domain names achieved high deception rates in user studies (Hu et al., USENIX Security 2021). Their work is worth reading in full, but the key finding is that users are not good at detecting homograph domains even when they are explicitly warned and even when the browser displays the Punycode form. The visual deception is more powerful than the warning.
The persistence of homograph attacks is not a failure of awareness. It is a structural problem. The DNS was designed around ASCII characters, and the security models that built trust on top of ASCII were never updated to account for the fact that visual sameness is not the same as computational sameness. Your brain processes apple.com as a word, not as a string of nine characters. Attackers exploit this cognitive shortcut.
The Technical Mechanism
How IDN Encoding Works
To understand homograph attacks, you need to understand how IDN encoding works at a basic level. The conversion process involves several distinct stages, each of which introduces opportunities for confusion.
────────────────────────────────────────────────────────────────────────
IDN ENCODING CONVERSION PROCESS
────────────────────────────────────────────────────────────────────────

  USER INPUT            аррӏе.com, entered as Unicode
  (Cyrillic)            (а р р ӏ е, rendered like "apple.com")
      │
      ▼
  REGISTRAR             IDN encoding converts the Unicode label
  (IDN encoding)        to Punycode ASCII: xn--80ak6aa92e.com
      │
      ▼
  DNS                   Query and response use the Punycode form
      │
      ▼
  BROWSER RENDERING     Browser decodes Punycode back to Unicode
                        and may display "apple.com"

  KEY: The displayed URL may look identical to the legitimate apple.com
       while the encoded name resolves to a completely different domain.
────────────────────────────────────────────────────────────────────────
When you register a domain name with non-ASCII characters, the name is converted to a format called Punycode. Punycode is a compact ASCII representation of Unicode characters defined in RFC 3492, and in DNS each encoded label carries the prefix xn--. The all-Cyrillic domain аррӏе.com becomes xn--80ak6aa92e.com in Punycode. When you visit xn--80ak6aa92e.com, your browser can convert it back to the Unicode form and display the Cyrillic characters.
This conversion happens transparently. The browser uses the Punycode form for DNS lookups, converts it internally, and decides what to render in the address bar. If the Unicode characters and the script combination pass the browser's security checks, you see the non-ASCII characters. If they do not, you see the Punycode form.
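The round trip can be sketched with Python's built-in `punycode` codec, which implements the raw RFC 3492 algorithm. This is a simplification: the helper names `to_ace` and `to_unicode` are mine, and a real IDNA implementation also applies mapping and validation rules that are skipped here.

```python
def to_ace(label: str) -> str:
    """Convert one DNS label to its ASCII-compatible (xn--) form.

    Uses the raw RFC 3492 codec only; IDNA mapping and validation are omitted.
    """
    if label.isascii():
        return label.lower()
    return "xn--" + label.lower().encode("punycode").decode("ascii")


def to_unicode(label: str) -> str:
    """Decode an xn-- label back to Unicode; other labels pass through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


spoof = "\u0430\u0440\u0440\u04cf\u0435"  # Cyrillic а р р ӏ е

print(to_ace(spoof))                         # xn--80ak6aa92e
print(to_ace("apple"))                       # apple
print(to_unicode("xn--80ak6aa92e") == spoof)  # True
```

The genuine ASCII label passes through untouched, while the lookalike becomes an unrelated string. DNS only ever sees the second form.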
The Chrome IDN spoofing policy documents the specific rules that Chrome applies. Chrome falls back to Punycode for domain names whose visual representation might be deceptive. It uses a system called "skeleton" comparison to detect when a domain name looks like another domain name, and it applies script restrictions to prevent mixing characters from incompatible writing systems (Chrome IDN Spoofing Policy).
────────────────────────────────────────────────────────────────────────
HOMOGRAPH ATTACK FLOW DIAGRAM
────────────────────────────────────────────────────────────────────────

  ATTACKER
    1. Select target brand: "apple"
          │
          ▼
    2. Identify confusable characters:
       а (Cyrillic), р (Cyrillic), ӏ (Cyrillic)
          │
          ▼
    3. Register malicious domain
       Registration request "аррӏе.com" ──> REGISTRAR (domain registry)
       Domain registered as xn--80ak6aa92e.com
          │
          ▼
    4. Deploy phishing content

  VICTIM
    Receives email link
          │
          ▼
    Sees "apple.com", clicks link
          │
          ▼
    Reaches phishing site at fake domain
          │
          ▼
    Credentials stolen ──> ATTACKER

  TIMELINE: Registration to attack often less than 48 hours
────────────────────────────────────────────────────────────────────────
Chrome's skeleton algorithm generates a simplified representation of each domain name that strips away script-specific features. The domain apple.com and the domain аррӏе.com should both generate the skeleton "apple" and trigger a warning. The problem is that the algorithm has documented blind spots. Domains that use a single script for all characters, or that use less common scripts, can sometimes bypass the skeleton checks entirely.
Firefox lets users control how IDN domains are displayed through the network.IDN_show_punycode preference. When the preference is set to true, Firefox displays every IDN domain in its Punycode form. That is a more conservative approach, which sacrifices readability for users of non-Latin scripts in exchange for better security (Mozilla Security).
Safari takes the most conservative approach by default. Safari tends to display Punycode for any IDN domain that might be confusing, which degrades the display of some legitimate IDN registrations but provides strong protection against homograph attacks.
────────────────────────────────────────────────────────────────────────
BROWSER DEFENSE MECHANISM COMPARISON
────────────────────────────────────────────────────────────────────────

  BROWSER   DETECTION METHOD                BEHAVIOR
  ───────   ────────────────                ────────
  Chrome    Skeleton-based comparison       Mixed-script detection; blocks
            plus whole-script detection     some whole-script attacks but
                                            has blind spots
  Firefox   Script-mixing restrictions;     Shows Punycode for every IDN
            optional Punycode preference    when network.IDN_show_punycode
                                            is enabled
  Safari    Strict script allowlist         Blocks most IDN display except
            plus skeleton matching          permitted scripts

  Chrome, skeleton algorithm flow:
    User visits "аррӏе.com"
        ▼
    Extract characters: [а][р][р][ӏ][е]
        ▼
    Normalize to Latin: apple (strip script info)
        ▼
    Compare against protected names: apple.com, google.com
        ▼
    MATCH FOUND
    Display: xn--80ak6aa92e.com

  Firefox, with network.IDN_show_punycode enabled:
    User visits "аррӏе.com"
        ▼
    Script detected: Cyrillic
        ▼
    Mixed with Latin ASCII: NO
    Punycode preference enabled: YES
        ▼
    Display: xn--80ak6aa92e.com

  Safari:
    User visits "аррӏе.com"
        ▼
    Check if Cyrillic allowed in TLD .com: YES
        ▼
    Skeleton comparison: does "аррӏе" look like a protected brand?
        ▼
    UNKNOWN / LIKELY SPOOF
    Display: xn--80ak6aa92e.com

  KEY INSIGHT: All browsers use similar skeleton concepts but implement
               them differently, leading to inconsistent protection
               across browsers and versions.
────────────────────────────────────────────────────────────────────────
The Confusables Table Problem
Unicode defines a standard called UTS #39 that specifies how to detect potentially confusable characters (Unicode UTS #39, Security Mechanisms). This standard includes a "confusables table" that lists pairs of characters that look alike.
The problem with relying on the confusables table is that it is incomplete by design. Unicode contains more than 150,000 characters across more than 160 scripts. Many of these characters look identical to each other in standard fonts, but the confusables table can only enumerate known pairs. When new characters are added to Unicode, or when new rendering contexts emerge, new confusable pairs can appear that are not yet in the table.
Furthermore, the confusables table treats all lookalike pairs as equivalent, but the actual visual similarity depends on the font being used. The same character sequence might look completely different in two different fonts. A domain name that looks deceptive in Arial might look legitimate in Times New Roman. The confusables table cannot account for all possible font rendering contexts.
Attackers have learned to work around the confusables table by testing their domains across multiple browsers and fonts before registering them. A domain that Chrome displays as Punycode might still be rendered in Unicode by another browser. A domain that looks safe in system fonts might look deceptive in browser UI fonts. The attack surface is large precisely because the rendering pipeline has so many variables.
────────────────────────────────────────────────────────────────────────
MIXED-SCRIPT CONFUSABLES BREAKDOWN
────────────────────────────────────────────────────────────────────────

  LATIN    CYRILLIC LOOKALIKE    GREEK LOOKALIKE (approximate)
  ─────    ──────────────────    ─────────────────────────────
  a        а                     α
  b        (none listed)         β
  c        с                     (none listed)
  e        е                     ε
  o        о                     (none listed)
  p        р                     (none listed)
  y        у                     γ

  COMMON ATTACK PAIRS (Verified in 2025-2026 research):

  LATIN    LOOKALIKE    UNICODE          OFTEN USED IN
  ─────    ─────────    ───────          ─────────────
  a        а            U+0430 (Cyr)     apple.com
  e        е            U+0435 (Cyr)     google.com
  o        о            U+043E (Cyr)     facebook.com
  p        р            U+0440 (Cyr)     paypal.com
  c        с            U+0441 (Cyr)     microsoft.com
  y        у            U+0443 (Cyr)     yahoo.com
  x        х            U+0445 (Cyr)     xiber.com
  i        і            U+0456 (Cyr)     itunes.com

  MIXED-SCRIPT ATTACK EXAMPLE (Latin and Cyrillic):

    Target: "google.com"
    Spoof:  "ɡооɡӏe.com"

    Character breakdown:
      g → ɡ (U+0261 Latin Small Letter Script G)
      o → о (U+043E Cyrillic Small Letter O)
      o → о (U+043E Cyrillic Small Letter O)
      g → ɡ (U+0261 Latin Small Letter Script G)
      l → ӏ (U+04CF Cyrillic Small Letter Palochka)
      e → e (Latin, confusable with Cyrillic е U+0435)

    Display:  "ɡооɡӏe.com", visually close to "google.com"
    Punycode: an xn--e-... label (only the Latin "e" survives as a
              literal character; exact value not shown)

  DEFENSE NOTE: The label mixes Latin and Cyrillic and its normalized
                skeleton is "google", so Chrome's mixed-script and
                skeleton checks should fall back to Punycode. The spoof
                works only where those checks are missing or incomplete.
────────────────────────────────────────────────────────────────────────
The Evolution of Attacks and Defenses
Era One: Discovery and Initial Mitigations (2001-2005)
The homograph attack was first publicly described by Evgeniy Gabrilovich and Alex Gontmakher in "The Homograph Attack," published in Communications of the ACM in February 2002 (DOI 10.1145/503124.503156). The paper described the basic technique of using Cyrillic characters to spoof Latin domain names and illustrated it with a lookalike of microsoft.com. The widely publicized demonstration against PayPal followed in 2005.
At the time, no browser implemented any defense against homograph attacks. Visiting a domain like paypal.com with Cyrillic characters would display the Cyrillic characters in the address bar with no indication that they were different from the Latin letters. Users had no way to distinguish a legitimate domain from a spoofed one.
ICANN and Mozilla responded with guidelines and TLD whitelists. Rather than displaying Unicode for any domain, Firefox showed IDNs in Unicode only under top-level domains whose registries had published policies restricting confusable registrations, and fell back to Punycode everywhere else. ICANN, for its part, issued IDN guidelines asking registries to limit which scripts and characters could be registered together.
The whitelist approach stopped the easiest attacks, but it created a compliance burden for legitimate internationalized domains and it did not address the fundamental problem. The problem is not the TLD. The problem is that second-level domains (the part before the TLD) can still contain confusable characters.
Era Two: Whole-Script Attacks and Browser Defenses (2017)
In 2017, Xudong Zheng published a proof-of-concept that demonstrated a new variation of the homograph attack. His apple.com domain was registered using all-Cyrillic characters, but because all the characters came from a single script, it passed the script restrictions that had been implemented in response to the original attack.
The key insight was that the whole-script attack bypassed the mixed-script detection that browsers had implemented. If all characters come from the same writing system, the browser might reasonably assume that the registrant intended to use that script and display the Unicode characters without additional warnings.
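A rough Python sketch shows why a mixed-script rule alone misses this. The standard library does not expose the Unicode Script property, so the helper below approximates a character's script by the first word of its Unicode name. That shortcut is an assumption of this example, not how any browser does it.

```python
import unicodedata


def scripts(label: str) -> set[str]:
    """Approximate the scripts in a label from Unicode character names.

    Heuristic only: takes the first word of each name (LATIN, CYRILLIC,
    GREEK, ...). Digits and hyphens are ignored as script-neutral.
    """
    found = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue
        found.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return found


def is_mixed_script(label: str) -> bool:
    return len(scripts(label)) > 1


mixed = "\u0430pple"                      # Cyrillic а followed by Latin pple
whole = "\u0430\u0440\u0440\u04cf\u0435"  # every character is Cyrillic

print(is_mixed_script(mixed))  # True: caught by a mixed-script rule
print(is_mixed_script(whole))  # False: the whole-script spoof passes
```

The second result is the 2017 bypass in miniature. The label is internally consistent, so a rule that only looks for mixing has nothing to object to.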
Chrome responded to the 2017 attack by implementing skeleton-based detection. Chrome generates a simplified representation of each domain name that strips away script-specific features and compares the skeleton against a list of known legitimate domains. When a skeleton matches a protected brand name, Chrome displays the domain in Punycode form instead of the Unicode form.
The skeleton approach works well for common brand names that use Latin characters. It works less well for brand names that already use non-Latin scripts, for brand names that are short, or for brand names that use characters that have confusables in other scripts.
Era Three: Modern State (2020-2026)
Today, homograph attacks have evolved beyond the simple all-Cyrillic or all-Greek approaches. Attackers now use language-specific characters that browser heuristics do not block because they belong to the Latin script itself. Spanish accented characters, Polish modified Latin letters, and Vietnamese diacritic combinations can all be used to create convincing spoofs. Because every character is Latin, these techniques do not trigger the mixed-script alarms that Chrome uses.
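One way to catch the diacritic variant is to compare labels after removing combining marks. The sketch below uses Unicode NFD decomposition from the standard library. The function names are mine, and the approach has a known gap: letters with no canonical decomposition, such as the Polish ł, pass through unchanged.

```python
import unicodedata


def strip_marks(label: str) -> str:
    """Decompose to NFD and drop combining marks (accents, dots, hooks)."""
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_diacritic_lookalike(label: str, brand: str) -> bool:
    """True when the label differs from the brand only by diacritics."""
    return label != brand and strip_marks(label) == brand


candidate = "g\u1ecdogle"  # Latin o with dot below in second position

print(strip_marks(candidate))                       # google
print(is_diacritic_lookalike(candidate, "google"))  # True
print(strip_marks("\u0142") == "\u0142")            # True: ł does not decompose
```

A check like this complements a confusables table rather than replacing one, since it only sees differences that Unicode expresses as combining marks.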
Recent analysis by researchers at Red Siege documented several active campaigns using these techniques in 2026 (Ian Briley, Red Siege Blog). The campaigns use domain names that look legitimate at a glance but contain subtle character substitutions that are not caught by browser defenses.
Akamai's security research has documented the growth of IDN-based phishing domains across their global network, tracking patterns in how attackers leverage internationalized domain names to evade detection (Akamai Security Intelligence Group). The actual number of active homograph domains is likely higher than any single source reports because many attacks go unreported and because new domains are registered daily.
The modern attack landscape has several notable characteristics. First, attackers use domain name registrars in jurisdictions with minimal enforcement. Script restrictions and verification requirements create hurdles but not impassable barriers. Second, attackers register domains that are similar to legitimate brands but not identical. The domain microsft.com (note the missing second "o") uses a different technique, typosquatting, but serves a similar deception purpose. Third, attackers combine homograph attacks with other techniques including AI-generated phishing content, personalized targeting, and multi-channel campaigns that increase credibility.
Recent CVEs and Security Incidents (2025-2026)
The vulnerability research community has continued to identify and document homograph attack vulnerabilities in browser engines and domain registration systems. Browser vendors have addressed IDN spoofing vulnerabilities in recent years through their security update processes, though the specific CVE details for browser-level IDN bypasses vary by vendor and version.
The MITRE Corporation's CVE program provides standardized identifiers for publicly known security vulnerabilities. The development of new homograph-related CVEs has accelerated as researchers apply systematic fuzzing techniques to IDN parsing code across browsers. Chrome's vulnerability reward program has paid out significant bounties for IDN spoofing findings, demonstrating that the attack surface remains active and valuable to attackers.
Browser vendor security advisories from 2025 and 2026 reflect the ongoing nature of this arms race. Google's Chrome Stable Channel updates regularly include IDN-related security fixes. Mozilla's security advisories document Firefox updates that address IDN display vulnerabilities. Apple's Safari updates similarly include patches for IDN spoofing issues discovered by security researchers.
The security community's ongoing focus on homograph attacks demonstrates that these vulnerabilities are not theoretical. Each patch addresses real exploitation scenarios discovered by researchers who probe browser IDN handling systematically. The pattern of discovery, disclosure, patch, and bypass suggests that homograph attacks will remain a viable attack vector for the foreseeable future.
The 2026 threat landscape has seen several notable trends in homograph attack deployment. First, the use of punycode domains in phishing campaigns has increased according to analysis from the Anti-Phishing Working Group (APWG). Their public reporting has documented growth in IDN-based phishing domains in recent quarters. Second, threat actors have begun combining homograph attacks with lookalike character substitution in passwords and usernames, creating layered deception campaigns that are difficult to detect. Third, the emergence of new Unicode characters in recent years has introduced additional confusable pairs that researchers discovered were not yet incorporated into browser skeleton algorithms.
Why Technical Users Still Fall Victim
The Cognitive Science of URL Processing
The most counterintuitive aspect of homograph attacks is that users who understand them intellectually still fall victim to them. Security awareness training covers homograph attacks. Blog posts explain how they work. Yet users who can recite the definition of Punycode will still click on a link that looks like apple.com in their address bar and enter their credentials.
The reason is that human visual processing does not work the way we assume it does.
Cognitive research on reading shows that humans process text as visual wholes, not as sequences of characters. When you look at the word "apple", your brain does not individually identify the letters a-p-p-l-e and then check each one against your memory. Instead, your brain recognizes the overall shape of the word, matches it against your vocabulary, and makes a decision about meaning. This process is called parallel processing, and it is extremely efficient for tasks like reading familiar words in familiar contexts.
The efficiency of parallel processing is precisely why homograph attacks work. When you see apple.com in your address bar, your brain does not verify that each character is the specific Latin letter you expect. It sees a familiar word shape and accepts it as legitimate, and only if something feels wrong, like an unexpected character that disrupts the shape, does your brain switch to serial processing and examine individual characters.
This is why font rendering matters so much. Two visually identical character strings might encode completely different meanings depending on which Unicode characters were used. Your brain cannot tell the difference if the shapes are the same.
Research by Ian Briley at Red Siege has documented this phenomenon in modern contexts. Even security professionals who know about homograph attacks will sometimes fail to detect them in practice because the visual processing shortcut is so deeply ingrained (Ian Briley, Red Siege Blog).
The Context Switching Problem
Technical users are not just vulnerable because of visual processing limitations. They are also vulnerable because of how they work.
Security professionals, developers, and IT administrators spend a significant portion of their day switching between contexts. They might be checking email, reviewing code, attending a meeting, and responding to a Slack message, all within the span of minutes, and each context switch creates a cognitive load that reduces attention to detail.
When you are in a task-switching context, your brain is not fully focused on URL verification. You are checking the link quickly enough to continue with the task at hand. You see the lock icon, you see the familiar word shape, and you proceed. This is not a failure of training. It is a predictable consequence of how attention works under cognitive load.
Homograph attacks are particularly effective against busy professionals because they exploit the shortcuts that efficiency requires. The attack does not need to fool a vigilant user. It only needs to fool a distracted user who is checking a link quickly enough to keep moving.
The Password Manager Problem
Many security-conscious users rely on password managers to protect against phishing. The logic is sound in theory. Password managers store credentials only for specific domains and will not autofill on unfamiliar sites. If a password manager refuses to autofill, the user should be suspicious.
In practice, password managers have a limitation that undermines their effectiveness against homograph attacks. Password managers verify domains by exact string match against the hostname the browser actually loaded, not by visual equivalence. If you visit xn--80ak6aa92e.com and have credentials stored for apple.com, the password manager will not autofill. The same is true for аpple.com with a single Cyrillic "а", because its Punycode form is a different string from apple.com. The weakness is what happens next. The address bar may still read apple.com, so the refusal looks like a glitch rather than a warning, and a user who trusts what they see can copy the password out of the vault and paste it in by hand.
This is not a flaw in password managers. It is a consequence of the fundamental mismatch between visual sameness and computational sameness. Password managers were designed to verify exact domain strings, not to understand Unicode normalization or visual equivalence.
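A minimal sketch of exact-host matching, assuming the manager compares ASCII hostnames, looks like this. Both function names are hypothetical and do not describe any particular product. The raw `punycode` codec stands in for a full IDNA conversion.

```python
def ascii_host(host: str) -> str:
    """Normalize a hostname to lowercase ASCII, encoding IDN labels as xn--."""
    labels = []
    for label in host.rstrip(".").split("."):
        label = label.lower()
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


def should_autofill(stored_host: str, visited_host: str) -> bool:
    """Offer credentials only when the two hosts are the same ASCII string."""
    return ascii_host(stored_host) == ascii_host(visited_host)


spoof_host = "\u0430\u0440\u0440\u04cf\u0435.com"  # all-Cyrillic lookalike

print(ascii_host(spoof_host))                    # xn--80ak6aa92e.com
print(should_autofill("apple.com", "Apple.COM"))  # True
print(should_autofill("apple.com", spoof_host))   # False
```

The comparison itself is sound. What it cannot do is stop a user from overriding the refusal, which is why the refusal should be treated as a signal and not as an inconvenience.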
The deeper problem is that the web security infrastructure was built on the assumption that domain names are ASCII strings. When you add Unicode to the equation, a single visual appearance can correspond to many distinct Unicode strings, each with its own Punycode form. Exact string matching tells these apart reliably. People, and any security control that depends on what is displayed rather than what is encoded, cannot.
The Defense-in-Depth Playbook
Layer One: Registry and Registrar Controls
The outermost layer of defense against homograph attacks is the domain registration system itself. Registries and registrars have implemented various controls to limit the creation of deceptive domain names, but these controls are not uniform across all registrars or all TLDs.
Some generic TLDs like .com and .net have minimal script restrictions at the registry level. Other TLDs like country-code domains often have stricter requirements. Newer generic TLDs like .app and .dev have implemented IDN policies that restrict which characters can be registered.
The practical implication is that not all TLDs are equally safe from homograph attacks. If you are evaluating domain names for your organization, the TLD matters. A domain name that looks like a spoof in .com might be impossible to register in .app due to character restrictions.
For enterprise security teams, defensive domain registration is a practical option that many organizations overlook. Registering common homoglyph variants of your brand name proactively can prevent attackers from using them. This approach requires ongoing monitoring as new variants are constantly being created, but it is effective for high-value targets.
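Enumerating candidates for defensive registration or monitoring can be sketched as a simple substitution product. The lookalike map here is a small hand-picked assumption, and the output is only a starting list: many mixed-script results will be rejected by registries, and the count grows exponentially with brand length, hence the `limit` parameter.

```python
from itertools import product

# Hand-picked Cyrillic lookalikes; extend with care for a real brand.
LOOKALIKES = {
    "a": ["\u0430"],
    "c": ["\u0441"],
    "e": ["\u0435"],
    "l": ["\u04cf"],
    "o": ["\u043e"],
    "p": ["\u0440"],
}


def homoglyph_variants(brand: str, limit: int = 1000) -> list[str]:
    """Return up to `limit` labels that swap in lookalike characters."""
    options = [[ch] + LOOKALIKES.get(ch, []) for ch in brand]
    variants = []
    for combo in product(*options):
        candidate = "".join(combo)
        if candidate == brand:
            continue  # skip the genuine label
        variants.append(candidate)
        if len(variants) >= limit:
            break
    return variants


variants = homoglyph_variants("apple")
print(len(variants))  # 31: two choices for each of five letters, minus the original
```

In practice you would prioritize the whole-script variant and the single-character swaps, since those are the ones most likely to be registrable and to render convincingly.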
Layer Two: Browser Defenses
Browsers remain the primary technical defense against homograph attacks at the user level. Chrome, Firefox, and Safari all implement some form of IDN spoofing detection, but the effectiveness varies significantly.
Chrome's skeleton-based detection is the most sophisticated but also has the most documented blind spots. The algorithm works well for common Latin brand names but struggles with short domains, non-Latin brand names, and domains that use uncommon scripts. Chrome also sometimes fails to detect whole-script attacks that use only characters from a single permitted script.
Firefox restricts script mixing by default, and its optional Punycode setting is more conservative and more predictable than any heuristic. If you enable network.IDN_show_punycode in Firefox, you will see the Punycode form for every IDN domain, deceptive or not. This is the most reliable way to protect yourself as an individual user, but it comes at the cost of losing the convenience of reading domain names in their native scripts.
Safari's approach is similarly conservative. It degrades legitimate IDN display for many non-English scripts but provides strong protection against homograph attacks.
The key insight is that browser defenses are necessary but not sufficient. Even if Chrome perfectly detected all homograph domains tomorrow, users would still encounter older browsers, mixed rendering environments, and edge cases where the detection fails. Browser defenses reduce the attack surface. They do not eliminate it.
Layer Three: Certificate Transparency Monitoring
Certificate Transparency (CT) logs provide a powerful tool for detecting homograph domain registration. When a certificate authority issues a TLS certificate for a domain name, the certificate is recorded in public CT logs. Security teams can monitor these logs for certificates issued for domains that look similar to their brand name.
The technique is straightforward. If you own example.com, you might monitor CT logs for any certificates issued for domains like examp1e.com, example-co.com, or other variants that could be used for phishing. More sophisticated monitoring can include homoglyph variants, though this requires understanding which Unicode characters are confusable with your brand characters.
CT log monitoring catches domains once a certificate has been issued for them, which is often before they are used in attacks. This timing matters because most phishing campaigns have a short window between domain registration and active use. By monitoring CT logs, you can detect and take action on malicious domains before they become operational.
Several commercial services provide CT log monitoring, or you can implement your own monitoring using open-source tools. The critical requirement is that monitoring be near-real-time because the attack window can be short.
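The matching step of such a monitor can be sketched as follows. The sketch assumes you already have a stream of DNS names taken from newly logged certificates. How you obtain that stream is out of scope, and the sample names below are made up. The lookalike map is a small hand-picked subset.

```python
LOOKALIKE_TO_LATIN = {
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u04cf": "l", "1": "l", "0": "o",
}


def decode_label(label: str) -> str:
    """Decode an xn-- label to Unicode; return it unchanged on failure."""
    if label.startswith("xn--"):
        try:
            return label[4:].encode("ascii").decode("punycode")
        except UnicodeError:
            return label
    return label


def skeleton(label: str) -> str:
    return "".join(LOOKALIKE_TO_LATIN.get(ch, ch) for ch in label)


def flag_lookalikes(dns_names: list[str], brand: str) -> list[str]:
    """Return names with a label that imitates `brand` without equalling it."""
    flagged = []
    for name in dns_names:
        for label in name.lower().split("."):
            decoded = decode_label(label)
            if decoded != brand and skeleton(decoded) == brand:
                flagged.append(name)
                break
    return flagged


sample = ["apple.com", "xn--80ak6aa92e.com", "example.org"]  # made-up feed
print(flag_lookalikes(sample, "apple"))  # ['xn--80ak6aa92e.com']
```

Because every label is checked, the same function also flags a lookalike used as a subdomain of an unrelated domain.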
Layer Four: DNS and Email Filtering
DNS filtering services can block access to known malicious domains, including homograph domains that have been reported to threat intelligence feeds. Similarly, email gateways can be configured to detect punycode patterns in sender domains and flag or block messages from suspicious sources.
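A gateway rule for the sender check can be as small as the sketch below, which uses `email.utils.parseaddr` from the standard library. The function names and the sample header are illustrative assumptions. An IDN sender is not malicious in itself, so the result is best treated as a signal to combine with others.

```python
from email.utils import parseaddr


def sender_domain(from_header: str) -> str:
    """Extract the lowercase domain from a From header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def has_idn_label(domain: str) -> bool:
    """True if any label is Punycode-encoded or contains non-ASCII."""
    return any(
        label.startswith("xn--") or not label.isascii()
        for label in domain.split(".")
    )


header = "Apple Support <help@xn--80ak6aa92e.com>"  # illustrative value

print(sender_domain(header))                 # xn--80ak6aa92e.com
print(has_idn_label(sender_domain(header)))  # True
print(has_idn_label("apple.com"))            # False
```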
The effectiveness of these controls depends on the quality of the threat intelligence they use. Homograph domains are often used once and discarded, which means that by the time a domain appears on a block list, the attack might already be over. Effective DNS and email filtering requires fresh threat intelligence and rapid updating of block lists.
For organizations that manage their own DNS infrastructure, locally maintained block lists can provide an additional layer of protection. BIND, Unbound, and other DNS servers support response policy zones (RPZ), which let you override the responses for domains you specify.
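If you already have a list of lookalike domains, turning it into RPZ policy records is mostly string formatting. The sketch below assumes the domains are already in their ASCII (punycode) form and uses the common RPZ convention that a "CNAME ." record answers NXDOMAIN. The function name is hypothetical, and the output is only the policy records. The zone still needs its own SOA and NS records and has to be referenced from your resolver configuration.

```python
def rpz_records(blocked_domains):
    """Render RPZ records that return NXDOMAIN for each domain and its subdomains.

    Owner names are written relative to the policy zone, so they carry no
    trailing dot. Input domains are expected in ASCII (punycode) form.
    """
    cleaned = {domain.lower().rstrip(".") for domain in blocked_domains}
    lines = []
    for domain in sorted(cleaned):
        lines.append(f"{domain} CNAME .")    # the domain itself
        lines.append(f"*.{domain} CNAME .")  # every subdomain
    return lines
```

For example, rpz_records(["examp1e.com"]) produces two lines, one for the domain and one wildcard for its subdomains, which you can append to the policy zone file before reloading it.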
Layer Five: User Behavior and Password Managers
At the individual level, password managers remain one of the most effective defenses against homograph attacks, despite the limitations described earlier. Password managers do not prevent you from visiting a malicious domain, but they do prevent you from accidentally entering credentials on the wrong site.
To maximize the effectiveness of password managers against homograph attacks, you should verify that your password manager is configured to require exact domain matching for credential autofill. Some password managers have options that relax this requirement for usability, and those options can reduce your protection against homograph spoofing.
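The reason exact matching helps is that software compares code points, not shapes. The sketch below shows the idea in its simplest form. It is not how any particular password manager is implemented, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, without any trailing dot."""
    host = urlsplit(url).hostname or ""
    return host.rstrip(".").lower()


def should_autofill(saved_url: str, current_url: str) -> bool:
    """Offer credentials only over HTTPS and only on an exact hostname match."""
    if urlsplit(current_url).scheme != "https":
        return False
    return host_of(saved_url) == host_of(current_url)
```

With a credential saved for https://apple.com/login, the check passes on any other https://apple.com page and fails on a hostname whose first letter is Cyrillic "а", even though the two render identically in many fonts. Relaxed matching modes replace the equality test with something looser, which is where the protection starts to erode.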
Beyond password managers, the most important individual defense is to develop the habit of verifying URLs for high-value targets like banks, email providers, and critical business applications. This verification should be character-by-character for domains you visit frequently, and you should type critical URLs directly rather than clicking links when possible.
The assume-breach mindset is also valuable here. Treat any login page you navigate to via a link as potentially compromised. If you can verify the domain by typing it directly, do so. If you are asked to enter credentials after clicking a link, be extra vigilant about the domain name and consider whether the context makes sense.
What Actually Works
After reviewing the attack evolution, browser defenses, and cognitive factors, I want to be direct about what actually works for defending against homograph attacks.
The honest answer is that no single control is sufficient. Homograph attacks exploit the gap between visual perception and computational representation, and that gap exists at every layer of the web stack. You cannot fix it at the browser level without breaking legitimate IDN functionality. You cannot fix it at the registry level without restricting internationalization. You cannot fix it at the user level without changing how humans process text.
What works is layered defense. Browser defenses reduce the attack surface. Certificate Transparency monitoring often catches lookalike domains before they are used. DNS filtering blocks known malicious domains. Password managers prevent credential leakage. User awareness training helps but is not sufficient on its own.
The technical depth of this post might feel overwhelming if you are not a security professional, but the practical recommendations are simple. Enable punycode display where your browser offers the option. Use a password manager and verify that exact-match domain verification is enabled. For high-value targets, type URLs directly rather than clicking links. If you manage domain names for an organization, register homoglyph variants proactively and monitor CT logs for lookalike certificates.
These controls are not new. They are not surprising. They are the same controls that security professionals have recommended for years. The difference is understanding why they work and why they are necessary.
Homograph attacks persist because they exploit fundamental aspects of human perception and because the Unicode standard continues to expand the character set that attackers can use. Every new Unicode version potentially adds new confusable characters that are not yet in detection databases. The arms race between defenders and attackers will continue as long as the visual deception remains possible.
The good news is that the attacks are also becoming more expensive to execute. Registry restrictions, browser defenses, and security awareness have raised the bar for successful homograph attacks. The easiest attacks no longer work. What remains are targeted attacks against high-value victims, which are more expensive to execute and more likely to be detected.
For most organizations and individuals, the practical risk from homograph attacks is not zero, but it is manageable with basic controls. What matters is understanding what those controls are, why they work, and where their limitations are.
What You Can Do Today
If you take one thing from this post, let it be this: "check the URL" is necessary but not sufficient advice. Visual URL verification is important, but it is also the step that homograph attacks are specifically designed to defeat.
Here is what I recommend you do today.
First, check your browser settings for IDN display options. In Firefox, open about:config and set network.IDN_show_punycode to true. This forces Firefox to display every IDN domain in its punycode form, not only the ones it considers deceptive, which is the most reliable way to see exactly what domain you are visiting. The cost is that legitimate IDN domains will also appear as xn-- strings. In Chrome and Safari, be aware that IDN display behavior varies by script and font, and take extra caution with domains that use non-Latin characters.
Second, verify that your password manager requires exact domain matching for credential autofill. Check the settings and confirm that exact matching is turned on. If your password manager offers an option to relax exact matching for usability, do not use it.
Third, if you manage domains for an organization, conduct a homoglyph variant audit. List the common ways an attacker could spoof your brand name using confusable characters, and check whether those domains are already registered. Register defensive variants for the most dangerous lookalikes, and set up CT log monitoring to detect newly issued certificates for the rest.
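A variant audit can start from a short script. The sketch below generates every single-character substitution of a brand label and converts each result into the punycode form you would look up with a registrar or in WHOIS. The SUBSTITUTES table is a small illustrative subset I picked, and the function names are hypothetical. It deliberately covers only one swap at a time, so whole-script spoofs, where every letter is replaced, need a separate pass.

```python
# Illustrative subset only (an assumption for this sketch); extend it from the
# Unicode confusables data for a real audit.
SUBSTITUTES = {
    "a": ["\u0430"],       # Cyrillic a
    "c": ["\u0441"],       # Cyrillic es
    "e": ["\u0435"],       # Cyrillic ie
    "l": ["1", "\u04cf"],  # digit one, Cyrillic palochka
    "o": ["\u043e", "0"],  # Cyrillic o, digit zero
    "p": ["\u0440"],       # Cyrillic er
}


def single_swap_variants(label: str) -> list:
    """Return every label that differs from the input by one confusable character."""
    variants = set()
    for index, char in enumerate(label):
        for substitute in SUBSTITUTES.get(char, []):
            variants.add(label[:index] + substitute + label[index + 1:])
    return sorted(variants)


def to_ace(label: str) -> str:
    """Return the ASCII form of a label, adding the xn-- prefix when needed."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def audit_list(label: str, tld: str = "com") -> list:
    """Build the list of ASCII domain names to check for existing registrations."""
    return [f"{to_ace(variant)}.{tld}" for variant in single_swap_variants(label)]
```

Running audit_list("apple") yields a handful of names, some plain ASCII such as app1e.com and some beginning with xn--. Many registries reject mixed-script labels, so a number of these will be unregistrable, but checking is cheaper than assuming.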
Fourth, if you work in enterprise security, ensure that your phishing simulation programs include homograph attack scenarios. Many organizations test employees with standard phishing templates but do not test for IDN spoofing. Adding this vector to your simulation program will help identify gaps in awareness and controls.
These steps will not make you immune to homograph attacks. Nothing will. But they will reduce your exposure significantly and give you visibility into areas where your current controls might be insufficient.
Homograph attacks are a reminder that web security is built on assumptions that have been violated repeatedly. The assumption that domain names are ASCII strings is no longer true. The assumption that visual sameness implies computational sameness is no longer true. The assumption that users can reliably detect spoofed domains by visual inspection is no longer true.
Understanding these assumptions, and understanding where they break down, is what separates security professionals who understand their tools from those who merely deploy them. The goal of this post has been to help you understand.