
Scripts: A Quick Glimpse Into a Unicode Text File




Here’s a quick and easy way to check which scripts are represented by the characters in a Unicode text file. Scripts identifies the script of each character in accordance with Unicode Standard Annex #24 (UAX #24) and displays the four-letter code of each script found in the file, with the most frequently occurring scripts listed first.
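The counting idea can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the `SAMPLE_RANGES` table and the `count_scripts` function are hypothetical names, and the table is a tiny hand-picked excerpt rather than the full Scripts.txt data. Characters outside the excerpt are lumped under a placeholder label.

```python
from collections import Counter

# Hypothetical, heavily simplified excerpt of script ranges.
# A real implementation would load every range from Scripts.txt.
SAMPLE_RANGES = [
    (0x0041, 0x005A, "Latn"),  # A-Z
    (0x0061, 0x007A, "Latn"),  # a-z
    (0x0391, 0x03A1, "Grek"),  # Greek capital alpha to rho
    (0x03A3, 0x03C9, "Grek"),  # Greek capital sigma to small omega
    (0x0410, 0x044F, "Cyrl"),  # basic Russian alphabet
]

def script_of(ch):
    """Return the script code for one character, per the sample table."""
    cp = ord(ch)
    for first, last, code in SAMPLE_RANGES:
        if first <= cp <= last:
            return code
    return "(other)"  # placeholder label, not a real script code

def count_scripts(text):
    """Return (script code, count) pairs, most frequent first."""
    return Counter(script_of(ch) for ch in text).most_common()

if __name__ == "__main__":
    for code, n in count_scripts("Hello \u041f\u0440\u0438\u0432\u0435\u0442 \u03b1\u03b2\u03b3"):
        print(code, n)
```

A linear scan over the range list is fine for a sketch; with the full data file a sorted table and binary search would be the more usual choice.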

[Image: sample screen shot of Scripts]

Scripts uses information from the Scripts.txt data file included as part of the Unicode Character Database, updated to Unicode 4.1, and referenced by UAX #24. It handles all “common” (Zyyy) and “inherited” (Qaai) characters as recommended in UAX #24.
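UAX #24 describes inherited characters (mostly combining marks) as taking their script from the character that precedes them. The sketch below illustrates that one idea on a list of per-character script codes. It is an assumption about how such a rule could be applied, not a description of what Scripts does internally; the function name `resolve_inherited` is hypothetical, and common (Zyyy) characters are simply left as they are.

```python
def resolve_inherited(scripts):
    """Replace each 'Qaai' entry with the code of the nearest preceding
    non-inherited entry. A leading 'Qaai' with nothing before it is kept."""
    resolved = []
    last_base = None  # most recent non-inherited code seen so far
    for code in scripts:
        if code == "Qaai" and last_base is not None:
            resolved.append(last_base)
        else:
            resolved.append(code)
            if code != "Qaai":
                last_base = code
    return resolved

if __name__ == "__main__":
    print(resolve_inherited(["Latn", "Qaai", "Zyyy", "Grek", "Qaai"]))
```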

Scripts can analyze multiple files (including wildcards) and automatically detects and analyzes files encoded in UTF-16 (big- or little-endian), UTF-8, and SCSU.
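One common way to detect these encodings is to look for a signature (byte order mark) at the start of the file. The following sketch assumes that approach; how Scripts itself performs detection is not stated here. The names `sniff_encoding` and `SIGNATURES` are hypothetical, and the fallback of trying a plain UTF-8 decode when no signature is present is likewise an assumption.

```python
import codecs

# SCSU-encoded form of U+FEFF, used as an SCSU signature.
SCSU_SIGNATURE = b"\x0e\xfe\xff"

# Signatures to test, paired with a label for the detected encoding.
SIGNATURES = [
    (codecs.BOM_UTF8, "UTF-8"),         # EF BB BF
    (SCSU_SIGNATURE, "SCSU"),           # 0E FE FF
    (codecs.BOM_UTF16_BE, "UTF-16BE"),  # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),  # FF FE
]

def sniff_encoding(data):
    """Guess the encoding of a byte string from its leading signature.

    Returns a label, or None if nothing matches and the bytes are not
    valid UTF-8. UTF-32 is deliberately not handled in this sketch.
    """
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    try:
        data.decode("utf-8")  # no signature: accept if it is valid UTF-8
        return "UTF-8"
    except UnicodeDecodeError:
        return None

if __name__ == "__main__":
    print(sniff_encoding(b"\xff\xfeA\x00"))
```

Python's standard library has no SCSU codec, so a file identified as SCSU would still need a separate decoder before its characters could be counted.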

Scripts is completely free and was written in response to a request on the Unicode public mailing list. It runs on all 32-bit Windows systems, even on Windows 95 and 98 systems that do not directly support Unicode. Scripts runs in console mode (i.e. on the command line) and must reside in the current directory or in a directory in the search path.

You probably won’t use Scripts every day, but it’s a nice tool to have around.

Download Scripts here. For more information or assistance, please contact the author.


Copyright © 2002–2007 by Doug Ewell  •  Last modified 2007-05-20