Unicode Support in Epsilon 14

In regular expressions, <hspace> and <wspace> now match Unicode's various space-like characters. As a result, delete-blank-lines, delete-horizontal-space, and other commands now treat those characters as spaces.
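
As a rough illustration of what treating these characters as spaces means, the following Python sketch deletes horizontal whitespace around a position, in the spirit of delete-horizontal-space. It is not Epsilon's implementation. The function names are hypothetical, and the definition of a horizontal space used here (a tab, or any character in Unicode category Zs) is an assumption, since the exact set that <hspace> matches is not listed in this section.

```python
import unicodedata


def is_horizontal_space(ch):
    """Return True for a tab or any Unicode space separator (category Zs).

    Assumption: this approximates a "space-like" character; the set Epsilon
    actually uses for <hspace> may differ.
    """
    return ch == "\t" or unicodedata.category(ch) == "Zs"


def delete_horizontal_space(text, pos):
    """Remove the run of horizontal spaces surrounding index pos in text."""
    # Walk left from pos while the preceding character is space-like.
    start = pos
    while start > 0 and is_horizontal_space(text[start - 1]):
        start -= 1
    # Walk right from pos while the current character is space-like.
    end = pos
    while end < len(text) and is_horizontal_space(text[end]):
        end += 1
    return text[:start] + text[end:]
```

For example, delete_horizontal_space("a\u00a0 \u2003b", 2) removes the no-break space, the ordinary space, and the em space between the two letters, while newlines elsewhere in the text would be left alone.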

Epsilon's show-spaces mode (toggled by change-show-spaces on Shift-F6) was improved to display a variety of space-like and zero-width Unicode characters more clearly. The hex display mode in set-show-graphic (Ctrl-F6) now displays hex codes for Unicode characters too.

When Epsilon prompts you to choose an encoding because a file contains Unicode characters, the message now shows the names of those characters and uses a clearer format.

The new command unicode-convert-to-ascii replaces some Unicode and Windows-1252 punctuation characters in the buffer (or highlighted region) with their nearest ASCII equivalents. It converts various quote-like characters to ' or ", various hyphen-like characters to -, and so forth, though it does not try to strip accents from letters.
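
The kind of conversion described for unicode-convert-to-ascii can be sketched in Python as a simple character translation. This is only an illustration under stated assumptions: the mapping table below is hypothetical and far from complete, the function name is invented for the example, and Epsilon's actual table is not given in this section.

```python
# Hypothetical, partial mapping from Unicode punctuation to ASCII.
# Epsilon's real conversion table is not documented here.
PUNCTUATION_TO_ASCII = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2010": "-",   # hyphen
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
}

# Precompute the translation table once; str.translate works per code point.
_TRANSLATION = str.maketrans(PUNCTUATION_TO_ASCII)


def convert_punctuation_to_ascii(text):
    """Replace mapped punctuation with ASCII; leave everything else alone."""
    return text.translate(_TRANSLATION)
```

Because only the listed punctuation is translated, accented letters pass through unchanged, which mirrors the statement that the command does not try to strip accents.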

The clipboard-convert-unicode variable recognizes a new bit. Setting bit 4 tells Epsilon to convert Unicode clipboard characters being pasted into an 8-bit buffer only if they have a corresponding ASCII (range 0-127) equivalent, and this is now the default. If you prefer the previous behavior, you can set this variable to 3, which makes Epsilon also convert certain Unicode characters with no ASCII equivalent to Extended ASCII (128-255, Windows-1252), and perform the reverse conversion when putting text back on the clipboard.

Epsilon's Unicode data (character names, for example) was updated to Unicode 9.0.

When point is at a Unicode surrogate character, the show-point command (Ctrl-x =) now displays the name of the complete character.
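
For readers unfamiliar with surrogates: in UTF-16, a character outside the Basic Multilingual Plane is stored as a high surrogate followed by a low surrogate, and the complete character's name can only be looked up after the pair is combined. The Python sketch below shows that standard UTF-16 arithmetic. The function names are hypothetical and this is not Epsilon's code; the names come from Python's own Unicode database, whose version may differ from the one Epsilon ships.

```python
import unicodedata


def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %#x" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate carries 10 bits of the code point's offset from 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def complete_character_name(high, low):
    """Return the Unicode name of the character encoded by a surrogate pair."""
    return unicodedata.name(chr(combine_surrogates(high, low)))
```

For example, the pair 0xD834, 0xDD1E combines to U+1D11E, whose name is MUSICAL SYMBOL G CLEF.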
