NAME
     mdicm - main dictionary maintenance tool for Kana-Kanji conversion

SYNOPSIS
     mdicm [ command ] [ arguments ]

AVAILABILITY
     SUNWjc0u

DESCRIPTION
     mdicm maintains a main dictionary for cs00 Kana-Kanji conversion.
     It also provides some functions for a user dictionary, which is
     used in the maintenance of a main dictionary.

     Started with a command, mdicm runs in non-interactive mode.
     Started without a command, mdicm shows the prompt

          mdicm>

     and runs in interactive mode. The `quit' command terminates
     mdicm. See the "COMMANDS and ARGUMENTS" section below for more
     detail.

COMMANDS and ARGUMENTS
     Each function of mdicm is performed by specifying a command and
     some arguments. mdicm supports the following commands:

          mshow, show, ladd, ldel, cat, create, extract, quit,
          hinshi, ?, help

     Each command is described below. In the following, main_dict
     means a main dictionary and user_dict a user dictionary. See the
     "File Formats" section below for the synopsis, word, reading and
     part-of-speech symbols of a words-list file.

     mshow main_dict user_dict [ -s start ] [ -e end ]
           [ -f filename ] [ -k ]
          Display the contents of the main dictionary. Read the words
          registered in the specified range, format them as a words
          list, and write it to the specified file. Arguments are as
          follows.

          -s start
               Specify the reading of the word at which the listing
               starts. Without this argument, the listing starts at
               the first word.

          -e end
               Specify the reading of the word at which the listing
               ends. Without this argument, the listing ends at the
               last word.

          -f filename
               Specify the words-list file to which the results are
               written. Without this argument, they are written to
               standard output.

          -k   Print the part-of-speech information as part-of-speech
               names. See mdicm(1) in the ja locale for an example.

     show main_dict user_dict [ -s start ] [ -e end ]
          [ -f filename ] [ -k ]
          Display the contents of the user dictionary. Read the words
          registered in the specified range, format them as a words
          list, and write it to the specified file.
          Arguments are as follows.

          -s start
               Specify the reading of the word at which the listing
               starts. Without this argument, the listing starts at
               the first word.

          -e end
               Specify the reading of the word at which the listing
               ends. Without this argument, the listing ends at the
               last word.

          -f filename
               Specify the words-list file to which the results are
               written. Without this argument, they are written to
               standard output.

          -k   Print the part-of-speech information as part-of-speech
               names. Without this argument, it is printed as
               Part-of-Speech-Symbols.

     ladd main_dict user_dict filename [ -l logfile ]
          [ -m new_main_dict ] [ -u new_user_dict ]
          Add multiple words at one time to the main dictionary.
          Arguments are as follows.

          -l logfile
               Write the results of registration to logfile. No
               logfile is created without this argument.

          -m new_main_dict
               Specify the main dictionary to create. Without this
               argument, it is created with the name newmdic in the
               current directory.

          -u new_user_dict
               Specify the user dictionary to create. Without this
               argument, it is created with the name newudic in the
               current directory.

          filename
               Specify the words-list file in which the words to add
               are listed.

     ldel main_dict user_dict filename [ -l logfile ]
          [ -m new_main_dict ] [ -u new_user_dict ]
          Delete multiple words at one time from the main dictionary.
          Options are as follows.

          -l logfile
               Write the results to logfile. No logfile is created
               without this argument.

          -m new_main_dict
               Specify the main dictionary to create. Without this
               argument, it is created with the name newmdic in the
               current directory.

          -u new_user_dict
               Specify the user dictionary to create. Without this
               argument, it is created with the name newudic in the
               current working directory.

          filename
               Specify the words-list file in which the words to
               delete are listed.

     cat main_dict user_dict filename [ -l logfile ]
         [ -m new_main_dict ] [ -u new_user_dict ]
          Merge another user dictionary into the main dictionary.
          Arguments are as follows.

          -l logfile
               Write the result to logfile.
               No logfile is created without this argument.

          -m new_main_dict
               Specify the main dictionary to create. Without this
               argument, it is created with the name newmdic in the
               current directory.

          -u new_user_dict
               Specify the user dictionary to create. Without this
               argument, it is created with the name newudic in the
               current directory.

          filename
               Specify the filename of the dictionary that is to be
               merged into the main dictionary.

     create new_main_dict new_user_dict file1 file2 file3 file4 file5
            file6 [file7]
     create new_main_dict new_user_dict filename
          The first synopsis creates a new main dictionary from files
          1-6 and a new user dictionary from the words-list file
          (file7). If file7 is not given, an empty user dictionary is
          created. Files 1-7 are as follows.

                                                     suffix
               file1  Jiritsugo file                 (.jir)
               file2  Conjunction matrix file        (.set)
               file3  Clause Terminator file         (.bun)
               file4  Conjugation file               (.kat)
               file5  Suffix file                    (.sbi)
               file6  Fuzokugo file                  (.fuz)
               file7  Words list of user dictionary  (.usr)

          The second synopsis creates a new main dictionary and a new
          user dictionary from the files whose names consist of
          filename followed by the above seven suffixes. All files
          with a suffix, such as filename.jir, should be in the same
          directory.

          NOTE: The formats of the files and of the words list are
          described in "File Formats".

     extract main_dict user_dict file1 file2 file3 file4 file5
             file6 file7
     extract main_dict user_dict filename
          The first synopsis extracts the contents of a main
          dictionary to files 1-6 and those of a user dictionary to
          the words-list file (file7).

          The second synopsis extracts the files whose names consist
          of filename followed by the above seven suffixes from a main
          dictionary and a user dictionary.

          NOTE: The formats of the files and of the words list are
          described in "File Formats".

     quit
          Quit mdicm (interactive mode).

     hinshi
          Show the list of part-of-speech symbols on standard output.

     ?, help
          Help. Show the command reference on standard output.
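
     The mshow and show synopses above can be assembled
     programmatically when mdicm is driven from a script. The
     following is a minimal Python sketch under stated assumptions,
     not part of mdicm: the helper name build_listing_argv and the
     dictionary paths are hypothetical, and only the flags documented
     above (-s, -e, -f, -k) are used. It builds the argument list and
     does not run the command.

```python
from typing import List, Optional

# Commands that share the listing synopsis documented above.
LISTING_COMMANDS = ("mshow", "show")


def build_listing_argv(
    command: str,
    main_dict: str,
    user_dict: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
    filename: Optional[str] = None,
    names: bool = False,
) -> List[str]:
    """Return an argument vector for a non-interactive mshow/show run.

    Hypothetical helper: it only mirrors the documented synopsis.
    """
    if command not in LISTING_COMMANDS:
        raise ValueError(f"not a listing command: {command!r}")
    argv = ["mdicm", command, main_dict, user_dict]
    if start is not None:
        argv += ["-s", start]      # reading at which the listing starts
    if end is not None:
        argv += ["-e", end]        # reading at which the listing ends
    if filename is not None:
        argv += ["-f", filename]   # words-list file to write
    if names:
        argv.append("-k")          # print part-of-speech names
    return argv


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_listing_argv("show", "main.dic", "user.dic",
                             filename="words.txt", names=True))
```

     On a system where mdicm is installed, the returned list could be
     passed to subprocess.run; that step is left out here because it
     depends on the local installation.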
File Formats
     Words list file
          The format of the words list used for the input and output
          of each command.

          Comments
               Lines starting with "#" are comments.

          Data
               Consists of three fields. The first and second fields
               are the reading and the word, respectively. The last
               one is the part-of-speech information, described as an
               enumeration of part-of-speech symbols. These fields are
               separated by half-size Katakana (Hankaku), white spaces
               or tabs. An example is shown in mdicm(1M) for the ja
               (Japanese) locale.

          Reading
               Up to 12 Hiragana characters defined in Japanese EUC
               Codeset 1 can be used. However, "you-on" (such as
               'xya'), 'wi', 'we', 'wo' and 'nn' are not permitted as
               the first character. For the second and subsequent
               characters, "cho-on" ('-' in Japanese EUC Codeset 1)
               can be used in addition to all Hiragana characters.
               "daku-on" and "handaku-on" (such as 'da' and 'pa') are
               treated as two characters.

          Word
               Up to eight characters defined in Japanese EUC Codeset
               1 can be used.

     Part-of-Speech-Symbols
          The part-of-speech information consists of the following
          part-of-speech symbols.

          Symbols  Part of speech        Remarks
          :N1      noun1                 general noun
          :N2      noun2                 pronoun
          :M1      person's name1        family-name
          :M2      person's name2        first-name
          :T1      place name1
          :T2      place name2           Names of prefectures
          :NM      numeral
          :NN      supplemental numeral  Mai(pieces), Kai(times),
                                         Nen(years), etc.
          :PR      prefix
          :SF      suffix
          :AD      adverb
          :CN      conjunction
          :RT      participial adjective
          :AJ      adjective
          :AV      adjective verb
          :SH      S-series irregular conjugation verb (Sahen-Doushi)
          :ZH      Z-series irregular conjugation verb (Zahen-Doushi)
          :1V      Single conjugation verb
          :KV      K-series five conjugation verb
                   (Kagyou-Godan-katsuyou-Doushi)
          :GV      G-series five conjugation verb
                   (Gagyou-Godan-katsuyou-Doushi)
          :SV      S-series five conjugation verb
                   (Sagyou-Godan-katsuyou-Doushi)
          :TV      T-series five conjugation verb
                   (Tagyou-Godan-katsuyou-Doushi)
          :NV      N-series five conjugation verb
                   (Nagyou-Godan-katsuyou-Doushi)
          :BV      B-series five conjugation verb
                   (Bagyou-Godan-katsuyou-Doushi)
          :MV      M-series five conjugation verb
                   (Magyou-Godan-katsuyou-Doushi)
          :RV      R-series five conjugation verb
                   (Ragyou-Godan-katsuyou-Doushi)
          :WV      W-series five conjugation verb
                   (Wagyou-Godan-katsuyou-Doushi)
          :UN      No Classification
          :TK      single kanji
          :BS      clause

     file1: Jiritsugo file
          A reading of Jiritsugo in a word part, a part of speech for
          the reading, Kanji, a part of speech for the Kanji, and a
          reading of a Kanji part or Kanji are specified in each line.
          Each line has a list of readings and candidates for words
          for the Kanji part.

          For a word part,

               RRR...R XXXXXXXX JJJJJJJ....JJJJ YYYYYYYY

          For a Kanji part,

               RRR....R:XXXXXXXX KKKKKKKKKKKKKKKKKKKKKK

          RRR...R
               Reading (Kana Code)

          XXXXXXXX
               A part of speech for the reading (expressed as an
               8-digit HEX number)

          JJJJJJJ...JJJJ
               Kanji (Japanese EUC Code)

          YYYYYYYY
               A part of speech for the Kanji (expressed as an
               8-digit HEX number)

          KKKKKKKKKKKKKKKKKKKKKK
               A Kanji character (string of Kanji (max 255))

          : (colon)
               Delimiter to distinguish the homonym and a Kanji
               character.

          (space)
               A half-width space is used to separate each item.

     file2: Conjunction matrix file
          A two-dimensional array (32 columns by 182 rows) of HEX
          numbers.
               XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
               XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
               XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
               XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
                       :               :
               XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

          X    (an expression of 1 HEX digit)

     file3: Clause Terminator file
          An expression of 46 HEX digits.

               XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXY

          X    (expression of a 1-digit HEX number)

          Y    (expression of 1 HEX digit; the lower 2 bits of its 4
               bits are not used)

     file4: Conjugation file
          Following a reading of Conjugation, a repetition of line
          numbers is specified in a line.

               RRR....R LLL LLL LLL LLL LLL
               *LLL LLL LLL LLL LLL
               /

          RRR...R
               Reading (Kana Code; if there is no reading, * is used)

          LLL  Line number (decimal number of up to 3 digits)

          /    Delimiter used to separate each of the 15 types of
               Conjugation, specified in the following order:

                1. Adjective
                2. Adjective verb
                3. No classification
                4. S-series irregular conjugation verb (Sahen-Doushi)
                5. Z-series irregular conjugation verb (Zahen-Doushi)
                6. Single conjugation verb
                7. K-series five conjugation verb (Kagyou-Godan-Doushi)
                8. G-series five conjugation verb (Gagyou-Godan-Doushi)
                9. S-series five conjugation verb (Sagyou-Godan-Doushi)
               10. T-series five conjugation verb (Tagyou-Godan-Doushi)
               11. N-series five conjugation verb (Nagyou-Godan-Doushi)
               12. B-series five conjugation verb (Bagyou-Godan-Doushi)
               13. M-series five conjugation verb (Magyou-Godan-Doushi)
               14. R-series five conjugation verb (Ragyou-Godan-Doushi)
               15. W-series five conjugation verb (Wagyou-Godan-Doushi)

     file5: Suffix file
          A reading, type and Kanji of the suffix are described in a
          line.

               RRR...R JJ
               *

          RRR...R
               Reading (Kana Code)

          N    Type (7 types with values 1, 2, 3, ..., 7)

          JJ   Kanji (max 2 Kanji characters; when the Kanji is 1
               character, the latter half has NULL+NULL)

          *    Used when reading, type and Kanji are not specified.

     file6: Fuzokugo file
          Following a reading of Fuzokugo, a repetition of the column
          number and row number of the Conjunction matrix is specified
          in a line.
               RRR...RR ccc,rrr ccc,rrr ccc,rrr ccc,rrr...ccc,rrr
               *ccc,rrr ccc,rrr ccc,rrr ccc,rrr...ccc,rrr
               /

          RRR...R
               Reading (Kana Code; when there is no reading, use *)

          ccc  Column number (decimal number of up to 3 digits)

          rrr  Row number (decimal number of up to 3 digits)

          , (comma)
               A pair of row and column numbers is separated by a
               comma.

          /    Terminator, used when neither a reading nor row and
               column numbers are specified.

FILES
     /usr/bin/mdicm

SEE ALSO
     udicm(1), cs00(1M)
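
EXAMPLES
     The file3 (Clause Terminator file) format described under "File
     Formats" is simple enough to check mechanically before passing
     the file to create. The following Python sketch is an
     illustration only, not part of mdicm. It assumes that the 46 HEX
     digits are read as one number with the most significant digit
     first, and that surrounding whitespace is not significant; the
     function name parse_clause_terminator is hypothetical. What the
     individual bits mean is not stated in this page and is not
     interpreted here.

```python
import string

EXPECTED_DIGITS = 46   # digit count stated for file3
UNUSED_LOW_BITS = 2    # lower 2 bits of the last digit (Y) are not used


def parse_clause_terminator(text: str) -> int:
    """Validate a Clause Terminator string and return its used bits.

    Assumption: the digits form one big-endian HEX number, so the
    unused bits of the last digit are the lowest bits of the value.
    """
    digits = text.strip()  # assumption: a trailing newline is not significant
    if len(digits) != EXPECTED_DIGITS:
        raise ValueError(
            f"expected {EXPECTED_DIGITS} HEX digits, got {len(digits)}"
        )
    if any(ch not in string.hexdigits for ch in digits):
        raise ValueError("non-HEX character found")
    value = int(digits, 16)
    # Drop the 2 unused low bits, leaving 46 * 4 - 2 = 182 bits.
    return value >> UNUSED_LOW_BITS


if __name__ == "__main__":
    sample = "0" * 45 + "C"   # made-up data, for illustration only
    print(parse_clause_terminator(sample))
```

     The 182 bits that remain equal the row count given for the
     Conjunction matrix (file2), though this page does not say whether
     the two are related.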