A tool to convert English text from one spelling system to another. At present, there are spelling files for american, british, and canadian spellings.
Requires Perl 5 and ispell version 3.2.06.epa1 or later. This is an unofficial release of ispell made to incorporate the new -e5 expansion option; the code will be merged back into the main ispell tree when the maintainer has time.
Simply having a lookup table from one spelling convention to another is not enough. Often there are two words which, because of differences in meaning or in pronunciation, are spelt differently in one system but the same in another. This is most notable moving from british to american spelling: for example cheque/check -> check, curb/kerb -> curb, and many others. But there are examples in the other direction too, for example vice/vise -> vice, and analyses/analyzes -> analyses (where the difference is in pronunciation as well as meaning).
Instead we create one lookup table for each language from 'words' to one or more 'spellings' for each word. A 'word' is an uppercase key like ANALYZE or CHEQUE, and two words are separate if there is any spelling convention which assigns them different spellings. Then to convert from one spelling convention to another, we do a reverse lookup in the source spelling's table from each character string to its corresponding 'words' (which may be more than one), and then in the target's table we find the most common spelling for each word. When more than one word is involved, and the most common target spelling for these words differs, the user must be asked what the intended meaning (or pronunciation) was.
For example, suppose we wanted to translate 'prophesy' from american to british. Looking up in american reveals two words which could use that spelling:
    PROPHECY: prophesy
    PROPHESY: prophesy
Now looking up the two words (PROPHECY and PROPHESY) in british gives:
    PROPHECY: prophecy
    PROPHESY: prophesy
So the two possible choices are 'prophecy' and 'prophesy'. It's the user's job to pick between them.
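The two-step lookup above can be sketched in a few lines of Python. This is only an illustration, not the code in Spelling.pm: the dictionaries AMERICAN and BRITISH and the functions reverse_index and candidates are hypothetical names, the tables hold just the two entries from the example, and the sketch assumes each list of spellings is ordered with the most common spelling first.

```python
from collections import defaultdict

# Hypothetical miniature tables: 'word' key -> spellings, most common first.
# They contain only the entries from the prophesy example.
AMERICAN = {"PROPHECY": ["prophesy"], "PROPHESY": ["prophesy"]}
BRITISH = {"PROPHECY": ["prophecy"], "PROPHESY": ["prophesy"]}


def reverse_index(table):
    """Map each spelling back to the set of 'words' that can use it."""
    index = defaultdict(set)
    for word, spellings in table.items():
        for spelling in spellings:
            index[spelling].add(word)
    return index


def candidates(text, source, target):
    """Return the distinct target spellings for one source spelling, sorted."""
    words = reverse_index(source).get(text, set())
    # Assumption of this sketch: words missing from the target table are skipped.
    found = {target[word][0] for word in words if word in target}
    # A string with no known 'word' is passed through unchanged.
    return sorted(found) if found else [text]


print(candidates("prophesy", AMERICAN, BRITISH))
```

When the returned list has more than one entry, the most common target spellings differ and the choice has to be left to the user.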
In fact, the spelling files give several 'words' on each line, by using ispell-style expansion flags. The IspellExpand.pm module runs ispell with the new -e5 option to convert these to several words. This is why version 3.2.06.epa1 or later of ispell is needed.
Three spelling files are provided: american, british and canadian. These can be loaded by the Spelling.pm module and conversion tables can be built. Then there are two executables:
The program 'respell' is a filter. Give it two spelling files (from and to) and it will convert text from one to the other. When a word has more than one possible output spelling, all the choices are included in the output inside square brackets, for example [ prophecy prophesy ]. You can disable this, and just pick the most common target spelling, with the -f option.
Words which don't need changing, and nonword characters, are passed through unchanged. By default, respell will only deal with lowercase words. The -i option tells it to handle Capitalized words, and -I handles UPPERCASE words.
Finally, the -q flag suppresses most of the chatter.
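The filter behaviour described above can be approximated as follows. This is a rough Python sketch under stated assumptions, not the respell program itself: CHOICES is a hypothetical prebuilt table from lowercase source spelling to candidate target spellings (most common first), and the keyword arguments force, capitalized and uppercase merely stand in for the -f, -i and -I options.

```python
import re

# Hypothetical table: lowercase source spelling -> candidate target spellings,
# most common first. The real data comes from the spelling files.
CHOICES = {"prophesy": ["prophecy", "prophesy"], "color": ["colour"]}


def respell_text(text, choices, force=False, capitalized=False, uppercase=False):
    """Convert the words in text; nonword characters pass through unchanged."""

    def convert(match):
        token = match.group(0)
        if token.islower():
            restore = lambda s: s          # lowercase words: always handled
        elif uppercase and token.isupper():
            restore = str.upper            # stands in for -I
        elif capitalized and token == token.capitalize():
            restore = str.capitalize       # stands in for -i
        else:
            return token                   # other casings are left alone
        options = choices.get(token.lower())
        if not options:
            return token                   # word does not need changing
        if force or len(options) == 1:
            return restore(options[0])     # single or forced choice
        # Ambiguous: list every choice inside square brackets.
        return "[ " + " ".join(restore(o) for o in options) + " ]"

    return re.sub(r"[A-Za-z]+", convert, text)


print(respell_text("I prophesy a color.", CHOICES))
print(respell_text("I prophesy a color.", CHOICES, force=True))
```

Treating the first list entry as the forced choice is an assumption of the sketch; the program's own rule for -f is to pick the most common target spelling.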
The program 'respell.cgi' is a slightly more sophisticated interface to respelling documents. For speed, it doesn't use the Spelling module but instead uses prebuilt lookup files. You can build these files with 'make'.
You need to install respell.cgi on your web server together with the data files. There may be a live demo at the website for this project (see below).
Sorry, since the move to a new web server the live demo no longer works. I hope to have it back up soon.
Currently, there is no 'make install' mechanism. You can either run the programs from the directory where they were unpacked, or copy the executables and .pm files somewhere suitable.
'make' will build some data files needed for respell.cgi. 'make test' will check that the conversion tables are as expected. There are corresponding 'make full' and 'make test_full' targets for an exhaustive set of files converting every possible spelling to every other.
The varcon table has the same purpose as this project, but it's inadequate because it doesn't handle one spelling mapping to two or more. It should however be possible to generate a new varcon list from this project's data files.
Ed Avis, email@example.com. See the file COPYING for copying conditions.
This project has a web page at <http://membled.com/work/apps/respell/>.