History of SILKin

In 1996, Gary Morris was approaching retirement from the IRS AI Laboratory. He felt a call to retire from the IRS early and return to Penn, where he had begun an AI PhD program 10 years earlier but had not finished. He felt God wanted him to do a dissertation in support of Wycliffe Bible Translators, although he knew very little about them.

After some searching, he made contact with Dr. Gary Simons, then Vice President for Academic Computing at the Summer Institute of Linguistics (SIL), the academic arm of Wycliffe. SIL had been praying for some time for help with a Machine Learning problem: Kinship Analysis. Machine Learning was Gary Morris's original focus, so Dr. Simons invited him to investigate the problem.

Gary attempted to take early retirement in 1997, and traveled to SIL's headquarters at Camp Wisdom, in Dallas, TX to study the problem domain. He enrolled in Dr. Tom Headland's introductory class in Anthropology for Field Translators and had many fruitful discussions about what field translators actually encounter when first learning a target language. It was Tom Headland who contributed the key insight that non-genealogical factors are often important to study.

While at Camp Wisdom he interviewed several seasoned field workers who had returned for further training. He was able to document the kinship systems they had studied and learn how cultural concepts and kinship systems interact.

Gary's retirement was denied, so he had to return to Washington, DC and finish his IRS career, but he had enough background to begin planning a dissertation project in Kinship Analysis.

He successfully retired in 2000 and reported immediately to Penn, where he resumed his PhD studies. Penn required him to pass comprehensive exams again (he had passed them in 1987), but by 2002 he had met all requirements and began full-time work on the problem.

System Design

Guided by his thesis advisor, Professor Lyle Ungar, and Penn's chairman of Anthropology, Professor Greg Urban, he developed the basic architecture of SILKin (SIL Kinship). The anthropology literature contained hundreds of analyses of kinship systems from around the world, so SILKin was designed in part to use transfer learning. That strategy tries to solve a problem in a new domain by looking for similarities to problems in other domains that are thought to have parallels with the domain of interest. In this case, the domain of interest would be a new language or culture (termed a context in SILKin). Cultures (contexts) for which the kinship system is well understood would then be similar domains. The goal would be to find learned concepts in the well-understood domains that could be transferred to the target domain (new context). This strategy would exploit the finding that certain patterns of kinship are found in multiple cultures around the world.

Although there were some strong conventions in anthropology for documenting the structure of a kinship system, there was no single lingua franca with a strict syntax for expressing kinship patterns. However, computer science had a suitable structure that could capture the logic of kin term definitions: Horn Clause logic. This is a "rule language" that is easily understood by humans and efficiently computed by machines. It is the basis for one of the earlier Artificial Intelligence languages: PROLOG.

Gary devoted two years of his dissertation research to building a library of already-solved kinship systems expressed in Horn Clauses. He ended up with 50 contexts (cultures) that represented a wide variety of nations and people groups. The library contained examples of every relationship pattern that has been observed to cross over between domains.

Horn Clauses have a strict syntax and broad expressive power. Also, through the use of auxiliary clauses it is possible to express a complex kinship pattern very succinctly. This makes Horn Clause logic very handy for humans, but it creates a problem when comparing the definitions of kin terms from different contexts. [The person who utters or uses a kin term is called Ego and the person referred to is called Alter.] The following example illustrates the problem:

One person might describe a kin relationship between Ego and Alter this way:

Alter is Ego's female parent's female parent's female child (who is not Ego's parent).

A different person might describe the same relationship using previously-defined kin terms that are common in all contexts:

Alter is Ego's mother's sister.

Not only is the second definition more compact, but it is also more precise if the definition of sister specifies that your sister shares not only your mother, but also your father. The first definition leaves that ambiguous.

The biggest problem, however, is that looking at two different sets of Horn Clauses it is difficult to tell if they describe the same relationship. Fortunately, any Horn Clause can be converted to a graph (in the computer science sense of that word). Such a graph is called a family tree chart by anthropologists. Two sets of Horn Clauses may look dissimilar, but if they each generate the same family tree chart, then they describe identical kinship relationships. (We assume that if a non-genealogical trait — like seniority — is included in the definitions, then that trait is reflected in the family tree chart.)

For this reason, SILKin adopted Horn Clauses as the internal representation of kin term definitions. It converts them to family trees whenever it must compare two sets of clauses to determine similarity.
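The comparison idea can be illustrated with a small Python sketch. It is not SILKin's implementation: the toy family, the function names, and the shortcut of enumerating (Ego, Alter) pairs instead of drawing a chart are all assumptions made for illustration. Each function plays the role of one set of clauses, and two definitions are treated as equivalent when they pick out the same pairs.

```python
from itertools import product

# Hypothetical toy family: person -> (mother, father, sex). None = not recorded.
PEOPLE = {
    "gran":      (None, None, "F"),
    "gramp":     (None, None, "M"),
    "other":     (None, None, "M"),
    "mom":       ("gran", "gramp", "F"),
    "aunt":      ("gran", "gramp", "F"),
    "half_aunt": ("gran", "other", "F"),  # shares only a mother with mom
    "dad":       (None, None, "M"),
    "ego":       ("mom", "dad", "F"),
}

def mother(x): return PEOPLE[x][0]
def father(x): return PEOPLE[x][1]
def is_female(x): return PEOPLE[x][2] == "F"

def aunt_primitive(ego, alter):
    """Ego's female parent's female parent's female child, not Ego's parent."""
    m = mother(ego)
    g = mother(m) if m else None
    return (g is not None and mother(alter) == g and is_female(alter)
            and alter not in (mother(ego), father(ego)))

def sister(x, y):
    """Auxiliary clause: y is female and shares BOTH parents with x."""
    return (y != x and is_female(y)
            and mother(x) is not None and father(x) is not None
            and mother(x) == mother(y) and father(x) == father(y))

def aunt_via_sister(ego, alter):
    """Ego's mother's sister, using the auxiliary clause above."""
    m = mother(ego)
    return m is not None and sister(m, alter)

def aunt_full_primitive(ego, alter):
    """Same meaning as aunt_via_sister, written out without the auxiliary."""
    m = mother(ego)
    if m is None or mother(m) is None or father(m) is None:
        return False
    return (alter != m and is_female(alter)
            and mother(alter) == mother(m) and father(alter) == father(m))

def dyads(definition):
    """All (ego, alter) pairs in the toy family that satisfy a definition."""
    return {(e, a) for e, a in product(PEOPLE, repeat=2) if definition(e, a)}

# Dissimilar-looking definitions, identical pairs: treated as equivalent.
print(dyads(aunt_via_sister) == dyads(aunt_full_primitive))
# The looser first definition also admits the mother's half-sister.
print(dyads(aunt_primitive) - dyads(aunt_via_sister))
```

On this toy family the last line isolates the half-sister case, which is exactly the ambiguity noted for the first definition. A sample family can only demonstrate a difference; agreement on one family does not prove two definitions equivalent in general.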

Learning Strategies

Although Machine Learning was (and is) enjoying a resurgence of popularity, almost all the progress and interest was focused on learning from massive data sets. Data Mining was the newest buzzword, and the techniques associated with trolling huge hoards of data were dominating AI research. At the time Gary attended Penn, data mining was showing exciting results in medicine, business, finance, and genomics. Virtually all the tools and research were based on statistics, most notably Neural Nets and Deep Learning.

However, none of this was applicable to the domain of Kinship Analysis. Instead of massive databases, kinship analysis is based on individual dyads gained from face-to-face interviews by a field worker. A collection of several hundred dyads would be considered quite large — and expensive. An interview might require travel by the field worker to a remote location over primitive roads. Interviews must be conducted in the target language or a trade language that is almost never native to the interviewer, and often the field worker must study the language for months before conducting the first interview. On small data sets, statistics are of little value, and gathering sufficient data to enable a statistical approach is out of the question.

Thus, the appropriate machine learning strategies for kinship were the older methods that handle small data sets without statistics. One attractive option was Active Learning, which assumes that a human oracle can provide data requested by the machine learning tool. A human-machine partnership makes sense for a field worker, and lends itself to a system that strives to minimize the cost of data acquisition (i.e. the number of interviews). SILKin adopted Active Learning as the primary learning strategy.

In addition to a human oracle, SILKin has another unusual asset: the Library of known kinship systems. The goal of Transfer Learning is to speed the learning of a concept or task in a new domain by adapting what has been learned in other, similar domains. If we view each language/culture/context as a separate domain, then there are many concepts that transfer easily from one domain to another. As an obvious example, the concepts of brother and aunt transfer perfectly among French, English, Spanish, etc. If those were universal concepts (as was once thought), then kinship analysis would be a simple matter of learning the words in the target language that correspond with each universal concept. That is not the case, but there are similarities within groups of languages. Discovering and exploiting those similarities could speed the learning process considerably, so Transfer Learning became another pillar of the SILKin design.

The academic goal of any dissertation is of course to contribute a new idea or technique to the literature. In the case of SILKin, the contribution was Active Error Correction. Errors are an issue in any machine learning system because data is almost never 100% correct. Every system makes allowance for noise in the data. In the context of kinship analysis, noise could be a misspelled kin term or assignment of the wrong kin term to a dyad. In statistical schemes, the only option is to ignore a certain percentage of the data that does not fit the concept being learned. But in Active Learning, the obvious strategy is to ask the human oracle to double-check a problematic dyad or to confirm that multiple kin terms might apply to a particular relationship.

SILKin uses Active Error Correction to eliminate data errors and also to confirm (or deny) the existence of synonym and umbrella terms. If half the dyads for female siblings are labeled 'sister' and half 'sis' there are two possibilities: either there is some hidden property that explains the difference or the two terms are synonymous. If the field worker confirms that two terms are synonyms, SILKin can then merge the two sets of dyads and proceed to learn with clean data. Or the field worker may investigate the difference between the two terms and discover a novel factor, perhaps a non-genealogical one.

Likewise, if the field worker confirms an umbrella term (e.g. 'grandparent') that overlaps with other, more specific terms, SILKin can learn that more than one kin term can apply to certain relationships.
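The query-the-oracle loop described above might look roughly like the following Python sketch. It is a simplified illustration under stated assumptions, not SILKin's code: dyads are reduced to (relationship, term) pairs, and the function names, the verdict strings, and the scripted oracle standing in for the field worker are all hypothetical.

```python
from collections import defaultdict

def find_conflicts(dyads):
    """Return the relationships that were labeled with more than one kin term."""
    terms_by_rel = defaultdict(set)
    for relationship, term in dyads:
        terms_by_rel[relationship].add(term)
    return {rel: sorted(terms)
            for rel, terms in terms_by_rel.items() if len(terms) > 1}

def correct(dyads, oracle):
    """Ask the oracle about each conflict and merge confirmed synonyms.

    oracle(relationship, terms) returns a (verdict, preferred_term) pair.
    The verdicts "synonyms" and "umbrella" are assumed labels; any other
    verdict leaves the dyads untouched for further investigation.
    """
    rename = {}
    overlapping = []
    for rel, terms in find_conflicts(dyads).items():
        verdict, preferred = oracle(rel, terms)
        if verdict == "synonyms":
            rename.update({t: preferred for t in terms})
        elif verdict == "umbrella":
            overlapping.append((rel, terms))
    cleaned = [(rel, rename.get(term, term)) for rel, term in dyads]
    return cleaned, overlapping

# Scripted stand-in for the field worker's answers (hypothetical).
def scripted_oracle(relationship, terms):
    if set(terms) == {"sis", "sister"}:
        return ("synonyms", "sister")
    return ("umbrella", None)

observed = [
    ("female sibling", "sister"),
    ("female sibling", "sis"),
    ("parent's mother", "grandmother"),
    ("parent's mother", "grandparent"),
    ("male sibling", "brother"),
]

cleaned, overlapping = correct(observed, scripted_oracle)
print(cleaned)
print(overlapping)
```

In this sketch the confirmed synonyms are merged under one term, while the umbrella term is recorded as a legitimate overlap so that both labels remain valid for the same relationship.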