canonicalize unicode identifiers

As discussed on [the mailing list](https://groups.google.com/d/msg/julia-users/ZyuSbZKTw1g/KBtb1ILtUEUJ), it is very confusing that

```
const μ = 3
µ + 1
```

throws a `µ not defined` exception (because the Unicode codepoints 0x00b5 and 0x03bc are rendered almost identically).  This is easy to hit in real usage, because option-m on a Mac produces 0x00b5 ("micro sign") rather than 0x03bc ("Greek small letter mu").
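To make the distinction concrete, here is a minimal sketch that writes both characters as explicit escapes, so nothing depends on how the keyboard or editor produced them. The names `micro` and `mu` are only illustrative, and the syntax assumes a recent Julia version.

```
# Two characters that render almost identically but are distinct codepoints.
micro = '\u00b5'   # "micro sign", what option-m on a Mac produces
mu    = '\u03bc'   # "Greek small letter mu"

println(micro == mu)                     # false
println(string(UInt32(micro), base=16))  # b5
println(string(UInt32(mu), base=16))     # 3bc
```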

It would be good if Julia internally stored a table of [easily confused Unicode codepoints](http://www.unicode.org/Public/security/revision-05/confusables.txt), i.e. homoglyphs, and used them to help prevent these sorts of confusions.  Three possibilities are:
- `foo not defined` exceptions could check whether a homoglyph of `foo` is defined and let the user know if so.
- Julia could issue a warning if a non-canonical homoglyph is used in an identifier.
- Julia could simply canonicalize all homoglyphs in identifiers (so users can type them any way they want, but they are treated as equivalent identifiers).
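
As a rough illustration of the first and third possibilities, the sketch below uses a one-entry, hand-written table in place of the real confusables data. `CANONICAL`, `canonicalize`, and `suggest` are hypothetical names invented for this example, not existing Julia functions, and the code says nothing about how the parser would actually be changed.

```
# Hypothetical stand-in for a confusables table: maps a non-canonical
# homoglyph to the codepoint chosen as canonical. A real table would be
# generated from the Unicode confusables data linked above.
const CANONICAL = Dict('\u00b5' => '\u03bc')   # micro sign => Greek mu

# Third possibility: rewrite every character of an identifier to its
# canonical form, leaving characters without a table entry unchanged.
canonicalize(name::AbstractString) = map(c -> get(CANONICAL, c, c), name)

# First possibility: when `name` is undefined, look for an already-defined
# name that differs from it only by homoglyphs. Returns `nothing` if none.
function suggest(name::AbstractString, defined)
    target = canonicalize(name)
    for d in defined
        if d != name && canonicalize(d) == target
            return d
        end
    end
    return nothing
end
```

With this table, `canonicalize("\u00b5")` yields the Greek-mu spelling, and `suggest("\u00b5", ["x", "\u03bc"])` returns the defined `"\u03bc"`, which an error message could then mention to the user.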

My preference would be for the third option.  I don't see any useful purpose being served by treating `μ` and `µ` as distinct identifiers.

canonicalize unicode identifiers #5434