Why does this happen:
> String.fromCharCode(0xd7FF)
''
> String.fromCharCode(0xd800)
'�'
> String.fromCharCode(0xdffe) // (and everything in between)
'�'
> String.fromCharCode(0xdfff)
'�'
> String.fromCharCode(0xe000)
''
D800₁₆ is 55296₁₀. I get the same results with String.fromCodePoint().
1 Answer
Code points U+D800 to U+DFFF are reserved as surrogates for the UTF-16 encoding. Effectively, these are code units which are never valid individually - they always come in surrogate pairs - a high surrogate followed by a low surrogate. (Confusingly, the "high surrogate" range is U+D800 to U+DBFF, and the "low surrogate" range is U+DC00 to U+DFFF: "high" and "low" describe which bits of the encoded code point each one carries, not which range is numerically larger.)
This pair of code units is combined in UTF-16 to represent a single character outside the Basic Multilingual Plane (U+10000 and above).
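To make the pairing concrete, here is a minimal sketch of the standard UTF-16 arithmetic. The helper names toSurrogatePair and fromSurrogatePair are made up for illustration and are not built-ins; U+1F600 is just an arbitrary example code point outside the BMP.

```javascript
// Hypothetical helpers (not part of the language) showing the UTF-16 arithmetic.

// Split a code point in U+10000..U+10FFFF into [high, low] surrogates.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;      // leaves a 20-bit value
  const high = 0xD800 + (offset >> 10);    // top 10 bits -> U+D800..U+DBFF
  const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits -> U+DC00..U+DFFF
  return [high, low];
}

// Recombine a high/low surrogate pair into the original code point.
function fromSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

const [high, low] = toSurrogatePair(0x1F600);
console.log(high.toString(16), low.toString(16));        // d83d de00
console.log(fromSurrogatePair(high, low).toString(16));  // 1f600

// Passing both halves to fromCharCode yields the same string as fromCodePoint.
console.log(String.fromCharCode(high, low) === String.fromCodePoint(0x1F600)); // true
```

Either half on its own is exactly the kind of value the question passes to String.fromCharCode.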
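Outside this special meaning in UTF-16, these aren't valid characters. String.fromCharCode still returns a one-unit string holding the lone surrogate, but that string isn't well-formed Unicode, so it's reasonable for whatever renders or encodes it (the console, here) to basically say "you haven't provided valid string data" and show the Unicode replacement character (U+FFFD) instead.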
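A quick way to see where the replacement character comes from is to inspect the string rather than print it. This is a small sketch assuming an environment with a global TextEncoder (modern browsers and recent Node.js versions); the variable names are arbitrary.

```javascript
// The string itself keeps the lone surrogate as a single UTF-16 code unit.
const lone = String.fromCharCode(0xD800);
console.log(lone.length);                      // 1
console.log(lone.charCodeAt(0).toString(16));  // d800

// Encoding to UTF-8 cannot represent a lone surrogate, so TextEncoder
// substitutes U+FFFD, whose UTF-8 bytes are EF BF BD.
const bytes = new TextEncoder().encode(lone);
console.log(Array.from(bytes, (b) => b.toString(16))); // [ 'ef', 'bf', 'bd' ]
```

A terminal or browser has to make a similar substitution when it draws the string, which is why '�' shows up in the REPL output.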