Field Linguist's Toolbox

Language Data Management and Analysis


Toolbox is a data management and analysis tool for field linguists. It is especially useful for maintaining lexical data, and for parsing and interlinearizing text, but it can be used to manage virtually any kind of data. Toolbox is free to download and use.

Toolbox scales well to large corpora and dictionaries. It can handle a corpus of over a million words and a dictionary of hundreds of thousands of words.

Although Toolbox is very powerful, it is designed to be easy to learn. The user can start with a simple standard setup and gradually adopt more powerful features as desired. The Toolbox downloads include a training package suitable both for self-paced individual learning and for classroom teaching.

See the Online Tlingit Verb Dictionary for an example of a web dictionary made with Toolbox.

Full design flexibility

At heart, Toolbox is a text-oriented database management system (DBMS) with added functionality designed to meet the needs of a field linguist. The underlying DBMS gives the user full flexibility in designing any type of database. For ease of use, however, the Toolbox package includes prepared database definitions for a typical dictionary and text corpus.
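To make "text-oriented database" concrete, here is a minimal Python sketch of reading records from a plain-text file in which each field starts with a backslash marker, the style Toolbox uses for its data files. The marker names (`\lx` for the lexeme, `\ps` for part of speech, `\ge` for an English gloss) follow a common dictionary convention but are assumptions here, the sample entries are invented, and `parse_records` is a hypothetical helper, not part of Toolbox.

```python
from typing import Dict, List

# Invented sample data in backslash-marker format (illustrative only).
SAMPLE = r"""
\lx kata
\ps n
\ge house

\lx miru
\ps v
\ge see
"""


def parse_records(text: str, record_marker: str = "lx") -> List[Dict[str, str]]:
    """Split marker-prefixed text into one dict per record.

    A new record starts at each line whose marker equals record_marker.
    Lines that do not start with a backslash are ignored in this sketch.
    """
    records: List[Dict[str, str]] = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue
        # "\ps n" -> marker "ps", value "n"
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = {}
            records.append(current)
        if current is not None:
            current[marker] = value.strip()
    return records


records = parse_records(SAMPLE)
for record in records:
    print(record)
```

Because the set of markers is not fixed in the code, the same reader works for any database design, which is the kind of flexibility described above.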

[Figure: Browse view]

The Toolbox DBMS offers features such as customized sorting, multiple views of the same database, a browse view that shows data in tabular form, and filtering to show subsets of a database. It can handle any number of scripts in the same database, and each script has its own font and sorting characteristics. While Unicode is preferred, Toolbox can handle scripts in most legacy encoding systems.
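The idea behind customized sorting and filtering can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Toolbox's own mechanism: the sort order (in which the digraph "ng" counts as a single letter placed after "n"), the sample records, and the helper `sort_key` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical alphabet for one script: "ng" is one letter, sorted after "n".
SORT_ORDER = ["a", "b", "d", "e", "g", "i", "k", "l", "m", "n", "ng", "o", "s", "t", "u"]


def sort_key(word: str, order: List[str] = SORT_ORDER) -> List[int]:
    """Turn a word into a list of letter ranks under a custom alphabet."""
    rank = {letter: i for i, letter in enumerate(order)}
    # Try longer letters first so "ng" is matched before "n".
    letters = sorted(order, key=len, reverse=True)
    key: List[int] = []
    i = 0
    while i < len(word):
        for letter in letters:
            if word.startswith(letter, i):
                key.append(rank[letter])
                i += len(letter)
                break
        else:
            key.append(len(order))  # unknown characters sort last
            i += 1
    return key


# Invented records; "ps" is part of speech.
records: List[Dict[str, str]] = [
    {"lx": "nga", "ps": "n"},
    {"lx": "nu", "ps": "v"},
    {"lx": "na", "ps": "n"},
    {"lx": "ma", "ps": "n"},
]

# Customized sorting: order records by the custom alphabet.
sorted_words = [r["lx"] for r in sorted(records, key=lambda r: sort_key(r["lx"]))]

# Filtering: show only the subset of records that are nouns.
nouns = [r["lx"] for r in records if r["ps"] == "n"]

print(sorted_words)
print(nouns)
```

Under plain character-by-character ordering "nga" would come before "nu"; with the custom alphabet it comes after, because "ng" is treated as a separate letter that follows "n".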

[Figure: Web dictionary]

Toolbox also has powerful linguistic functionality. It includes a morphological parser that can handle almost all types of morphophonemic processes. Its word formula component allows the linguist to describe all the possible affix patterns that occur in words. Its user-definable interlinear text generation system uses the morphological parser and the lexicon to generate annotated text, and interlinear text can be exported in a form suitable for use in linguistic papers. Toolbox can also export a dictionary database in a form suitable for producing a publishable dictionary.
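As a rough illustration of how a parser and a lexicon combine to produce interlinear text, the Python sketch below splits each word into a root plus suffixes by lexicon lookup and prints aligned word, morpheme, and gloss lines. It is a toy under stated assumptions: the lexicon entries are invented, the functions `parse_word` and `interlinearize` are hypothetical, and no morphophonemic processes are handled, so it says nothing about how Toolbox's own parser is implemented.

```python
from typing import List, Optional, Tuple

# Invented lexicon: roots and suffixes with their glosses.
ROOTS = {"kata": "house", "miru": "see"}
SUFFIXES = {"mi": "PL", "su": "LOC"}

Analysis = List[Tuple[str, str]]  # (morpheme, gloss) pairs


def parse_suffixes(rest: str) -> Optional[Analysis]:
    """Consume the remainder of a word as a sequence of known suffixes."""
    if not rest:
        return []
    for form, gloss in SUFFIXES.items():
        if rest.startswith(form):
            tail = parse_suffixes(rest[len(form):])
            if tail is not None:
                return [("-" + form, gloss)] + tail
    return None


def parse_word(word: str) -> Optional[Analysis]:
    """Return the first root + suffixes analysis, or None if none is found."""
    for root, gloss in ROOTS.items():
        if word.startswith(root):
            tail = parse_suffixes(word[len(root):])
            if tail is not None:
                return [(root, gloss)] + tail
    return None


def interlinearize(sentence: str) -> List[str]:
    """Build three column-aligned lines: words, morphemes, glosses."""
    rows: List[List[str]] = [[], [], []]
    for word in sentence.split():
        # "***" is a placeholder gloss for words the toy parser cannot analyze.
        analysis = parse_word(word) or [(word, "***")]
        morphs = "".join(form for form, _ in analysis)
        glosses = "-".join(gloss for _, gloss in analysis)
        width = max(len(word), len(morphs), len(glosses))
        for row, cell in zip(rows, (word, morphs, glosses)):
            row.append(cell.ljust(width))
    return [" ".join(row).rstrip() for row in rows]


lines = interlinearize("katamisu miru")
print("\n".join(lines))
```

Each word occupies one column whose width is set by its longest line, which keeps morphemes and glosses visually aligned under the word they annotate.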