SOME ASPECTS OF MAKING
DICTIONARIES WITH SHOEBOX TOOLS
N. Nechukayeva, V. Tolstoy
Ukraine, Dnepr, NMetAU
This article examines the SHOEBOX program as a research tool used together
with the MultiDictionary Formatter. The Formatter permits combining multiple
dictionaries when the translation text is taken from a file, and the fields may
be changed directly in the database.
In modern terms, the theory and practice of compiling new dictionaries is
called computer lexicography. Lexicography itself is a discipline devoted to
describing and investigating the semantic relationships within the vocabulary
of a single language. Today it is a young but in-demand discipline that
includes advances in the theory of dictionary formation and of its constituent
elements, together with the linking of data.
At the same time, practical lexicography is the craft of compiling, writing
and editing modern dictionaries.
The Shoebox program addresses three important aspects of descriptive
linguistics: text collection, lexicon (vocabulary) and grammar.
What is called lexicography is a many-sided process. It comprises a long list
of aspects, some of which are the following:
understanding the functional, cultural, semantic
and structural areas;
structuring the information, that is, deciding on the kinds of information in
a dictionary entry, the codes, the ordering of the data, the notation and
other items;
entering the structured information, e.g. into the vocabulary database
(usually a long process);
controlling and checking the information in the lexical database;
processing the data for lexical purposes;
formatting for output and amending the data;
printing of a new dictionary;
advertising and marketing;
distribution of dictionaries.
The SHOEBOX program is a utility that can handle all of these items. The
MultiDictionary Formatter (MDF) makes it possible to combine multiple
dictionaries into one. This matters especially for translations, which by
default come from files, while records may be changed directly in the
database. MDF assigns priorities to the vocabularies. When a scientist,
teacher or student transfers dictionaries with the help of the constructor,
the dictionary containing the requested translation ranks first in relevance.
One often meets cases, however, in which dictionaries are added later and the
newly added one receives the priority. Here we assume that the highest
priority is zero: the lower the number, the higher the position the
dictionary takes.
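The priority rule described above can be illustrated with a short Python
sketch. It is not MDF's actual implementation: the function name `lookup`, the
data layout (a list of `(priority, entries)` pairs) and the sample words are
assumptions made only for illustration. Priority 0 is treated as the highest,
and a lower number wins.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout (an assumption, not MDF's internal format):
# each dictionary is a (priority, entries) pair, where entries maps
# a headword to its translation.
Dictionary = Tuple[int, Dict[str, str]]


def lookup(word: str, dictionaries: List[Dictionary]) -> Optional[str]:
    """Return the translation from the highest-priority dictionary
    that contains the word, or None if no dictionary has it.

    Priority 0 is the highest; the lower the number, the earlier
    the dictionary is consulted.
    """
    for _priority, entries in sorted(dictionaries, key=lambda d: d[0]):
        if word in entries:
            return entries[word]
    return None


# Illustrative data only.
general = (1, {"water": "voda", "house": "dim"})
technical = (0, {"water": "voda (tech.)"})

# The order in which dictionaries were added does not matter;
# only the priority number does.
dictionaries = [general, technical]

print(lookup("water", dictionaries))  # taken from the priority-0 dictionary
print(lookup("house", dictionaries))  # falls back to the priority-1 dictionary
```

Sorting by the priority number at lookup time means that a dictionary added
later can still be given precedence simply by assigning it a lower number.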
A comparison with word-processor text files, created for example in MS Word,
shows that a database structure for compiling dictionaries gives real
advantages in controlling and changing the information. The SHOEBOX program
has the following new properties:
it makes research and analysis in large lexical databases very fast and
affordable;
it allows working with non-adjacent entries and copying and pasting
information, e.g. by means of JUMP;
a number of lexicographic sort orders, which the user defines, make it
possible to sort characters such as "o" followed by "ô" or "u" followed by
"ü", and to handle digraphs (sc, ng, ph, th, ti, sh, ch, etc.);
it helps to check compliance with a master list by means of SHOEBOX range
sets, for example for parts of speech. This ensures quality control (QC)
during compilation;
the ability to search across different
databases. It is very important in comparing dictionaries based on the same
the use of a template to insert user-defined codes automatically into each
new record. The lexical database interacts with a text corpus for dictionary
construction, spell-checking and finding example sentences.
Taken together, the items mentioned above provide multi-purpose language
information that needs to be updated only once. Texts grounded in linguistics
and the study of lexicography provide a very solid basis for language and
culture planning.
SHOEBOX has filters that can isolate or extract, at the user's request,
categories of information for different purposes (semantic domain, parts of
speech, origin, phraseology, etc.).
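A filter of this kind can be sketched in Python as a selection over records.
The sketch assumes that records are stored as backslash-coded fields; the
markers used here (`\lx` for the headword, `\ps` for the part of speech, `\ge`
for the gloss), the sample data and the function names are assumptions made
for illustration, and the code does not reproduce Shoebox's own filter syntax.

```python
from typing import Dict, List

# Illustrative data only. The field markers are assumed for this sketch.
SAMPLE = r"""
\lx run
\ps v
\ge move fast

\lx house
\ps n
\ge dwelling

\lx walk
\ps v
\ge move on foot
"""


def parse_records(text: str) -> List[Dict[str, str]]:
    """Split backslash-coded text into records; each \\lx starts a new one."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything that is not a field
        marker, _, value = line[1:].partition(" ")
        if marker == "lx" and current:
            records.append(current)
            current = {}
        current[marker] = value.strip()
    if current:
        records.append(current)
    return records


def filter_records(
    records: List[Dict[str, str]], marker: str, value: str
) -> List[Dict[str, str]]:
    """Keep only the records whose given field equals the given value."""
    return [record for record in records if record.get(marker) == value]


records = parse_records(SAMPLE)
verbs = filter_records(records, "ps", "v")
print([record["lx"] for record in verbs])
```

The same `filter_records` call can isolate any other category, for example a
semantic domain, provided the records carry a field for it.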
SHOEBOX lets the user systematically replace or delete specified words, which
are units of sentences. The user can also change the style, font or anything
else at any moment. Used with databases, the SHOEBOX tool makes it possible to
build a fairly complicated finder list in a short time, from a few minutes to
a few hours, with the help of the MultiDictionary Formatter. Previously this
took weeks or months of painstaking manual work in a word processor.
Formatting and printing a dictionary was a long and complicated process that
required filling in each dictionary entry, adding new information and showing
examples of use with different parts of speech (using, for example, the
backslash "\" to separate meanings). The appearance of MDF therefore bridges
the gap between compiling and printing, and it produces the familiar
two-column dictionary. In the "Format" menu, pressing the F key changes the
database format into the customary two-column one. MDF prompts the user to
answer some simple questions and then, on the basis of the answers, inserts
odd and even page footers into the new computer dictionary. These include not
only the page number and the letter but also the name of the language, the
current date and time, and chapter dividers with uppercase and lowercase
letters between each new chapter of entries. The program displays the
questions on the screen, and the user can obtain up to 16 different
combinations. The initial data file is not changed in the process.
Setting up data and
functions: morphological analysis, data coordination and its export to other