Feature Request: Detect and Flag Non-ASCII Characters in Identifiers
Summary
Add a StyleCop rule (or rules) to detect and flag identifiers that contain non-ASCII characters (e.g., Greek, Cyrillic), which can be visually indistinguishable from standard Latin letters.
Use Case / Motivation
When coding in C#, developers sometimes inadvertently switch keyboard layouts (e.g., to Greek) and end up typing characters that look identical to standard Latin letters but are actually different Unicode code points. For instance:
    public interface ΙMyService // 'Ι' here is Greek capital Iota (U+0399)
    {
        // ...
    }

    public class IMyService : ΙMyService // This won't compile as expected
    {
        // ...
    }
It’s very easy to end up troubleshooting odd compile errors or references not matching, only to discover a single character is from the wrong alphabet.
A StyleCop rule that flags these occurrences would provide immediate feedback to developers, preventing such subtle bugs.
Proposed Solution
New Rule:
ID: Suggest something like SA???? (whatever fits StyleCop’s numbering scheme).
Name: “IdentifiersMustUseAsciiCharacters”
Category: “Naming” or “Maintainability.”
Severity: Configurable; default to Warning.
Behavior:
For each identifier (class, interface, method, property, field, local variable, parameter, etc.), scan the text for any character outside the ASCII range (> 0x7F).
If found, report a diagnostic indicating which identifier is problematic.
Configuration:
Allow users to set whether they want to disallow all non-ASCII characters or only certain sets of known homoglyphs (e.g., Greek, Cyrillic, etc.).
Possibly allow ignoring some characters if needed for legitimate non-English names (but that might be out of scope for a first pass).
Rationale:
This rule prevents confusion caused by visually identical but semantically different characters, saving time and reducing friction during development.
Many teams adopt “English-only identifiers” as a best practice to avoid these pitfalls, so providing built-in enforcement aligns with real-world usage.
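The two checking modes described under Behavior and Configuration could look roughly like the sketch below. The class and method names are hypothetical and do not exist in StyleCop Analyzers. The narrower mode assumes that "known homoglyphs" means the Greek and Coptic (U+0370 to U+03FF) and Cyrillic (U+0400 to U+04FF) Unicode blocks, which is only one possible interpretation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of StyleCop Analyzers.
public static class IdentifierAsciiCheck
{
    // Strict mode: yields the index and value of every UTF-16 code unit above 0x7F.
    public static IEnumerable<(int Index, char Character)> FindNonAscii(string identifier)
    {
        for (int i = 0; i < identifier.Length; i++)
        {
            if (identifier[i] > 0x7F)
            {
                yield return (i, identifier[i]);
            }
        }
    }

    // Narrower mode (assumption): only flag characters from the Greek and Coptic
    // or Cyrillic blocks, and leave other non-ASCII letters alone.
    public static bool ContainsGreekOrCyrillic(string identifier)
    {
        foreach (char c in identifier)
        {
            bool isGreek = c >= '\u0370' && c <= '\u03FF';
            bool isCyrillic = c >= '\u0400' && c <= '\u04FF';
            if (isGreek || isCyrillic)
            {
                return true;
            }
        }

        return false;
    }
}
```

For example, `IdentifierAsciiCheck.FindNonAscii("\u0399MyService")` yields a single hit at index 0 (the Greek capital Iota), while the all-Latin `"IMyService"` yields none.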
Potential Implementation Details
Roslyn:
A SyntaxNode or SyntaxToken analysis hooking into SyntaxKind.IdentifierToken.
Diagnostic message, something like: “Identifier {0} contains non-ASCII characters and may cause confusion.”
Example:
    public void ΜyMethod() // This 'Μ' might be Greek capital Mu
    {
    }
The analyzer would produce a warning explaining that the identifier is using a non-ASCII character.
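A minimal sketch of such an analyzer is shown below, assuming a project that references the Microsoft.CodeAnalysis.CSharp NuGet package. The class name and the `SAXXXX` ID are placeholders, since no real rule ID has been assigned, and the code does not follow StyleCop Analyzers' internal conventions. It walks the syntax tree and reports every identifier token whose text contains a character above 0x7F.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

// Hypothetical analyzer for illustration; not an existing StyleCop rule.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class NonAsciiIdentifierAnalyzer : DiagnosticAnalyzer
{
    // "SAXXXX" is a placeholder ID, not a real StyleCop rule number.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "SAXXXX",
        title: "Identifiers must use ASCII characters",
        messageFormat: "Identifier '{0}' contains non-ASCII characters and may cause confusion",
        category: "Naming",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();
        context.RegisterSyntaxTreeAction(AnalyzeTree);
    }

    private static void AnalyzeTree(SyntaxTreeAnalysisContext context)
    {
        SyntaxNode root = context.Tree.GetRoot(context.CancellationToken);

        foreach (SyntaxToken token in root.DescendantTokens())
        {
            // Only identifier tokens are of interest; keywords, literals, etc. are skipped.
            if (!token.IsKind(SyntaxKind.IdentifierToken))
            {
                continue;
            }

            if (ContainsNonAscii(token.ValueText))
            {
                context.ReportDiagnostic(
                    Diagnostic.Create(Rule, token.GetLocation(), token.ValueText));
            }
        }
    }

    private static bool ContainsNonAscii(string text)
    {
        foreach (char c in text)
        {
            if (c > 0x7F)
            {
                return true;
            }
        }

        return false;
    }
}
```

As written, this sketch flags every occurrence of an offending identifier, references as well as declarations. A real rule might prefer to report only at the declaration site, which would mean registering symbol or declaration-node actions instead of scanning all tokens.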
Benefits
Immediate Feedback: Prevents confusion from near-homoglyphs that can break references or cause subtle bugs.
Aligns with Common Practices: Many coding standards advise using only ASCII for public-facing identifiers.
Minimal Overhead: Implementation is straightforward (simple character check).
Highly Configurable: Could provide toggles or whitelists for teams who need exceptions.
Possible Downsides or Considerations
Legitimate Use of Non-ASCII: In some projects, non-English words or domain-specific terminology might be intentionally used. A global rule might cause false positives.
Mitigation: Provide .editorconfig or rule settings so the user can suppress or allow certain code blocks or whitelisted characters.
Thank you for all the great work on StyleCop Analyzers. We’d love to see this feature to help developers avoid tricky Unicode/homoglyph issues in their day-to-day C# projects.