John Kovarik

Kovarik Consulting

Independent Language Consultant specializing in NLP applications software. MA in Chinese Language, University of Wisconsin, Madison. Member of AMTA, ACL, ACL-CLP (The Association for Computational Linguistics and Chinese Language Processing), and CIPS (The Chinese Information Processing Society).

Note: These pages are encoded in simplified Chinese (GB 2312-80). A traditional Chinese version is also available.



Major Areas of Interest

Chinese Treebanking

Chinese Parsing and Natural Language Processing

Third Order Markov Model Chinese Text Generation

Chinese-English Lexical Conceptual Structures at UMd

Chinese-English Semantics

Evaluation of a Language Model for Chinese

Developing Guidelines and Ensuring Consistency for Chinese Text Annotation

Extending a CFG Parser to Mongolian

Chinese Historical Linguistics



Abstracts & Publications

1969 A Study of Poetic Dialects in the Regulated Verse Rhyming Categories of Four Late Tang Poets of the Early Ninth Century


1997 On Developing a Chinese Language Model for Explanatory Adequacy, JHU.

1998 On Preparing Chinese Text for Natural Language Processing, UPenn.

Later, as a language researcher on loan to UMd, I also contributed to "Translating English and Mandarin Verbs with Argument Structure (Mis)matches using LCS Representation" (Mari Broman Olsen, UMIACS, 1998), published in the Proceedings of the 1998 SIG-IL Conference.

Additionally, I contributed to the "University of Maryland LCS-based Chinese-English MT System", presented at the Natural Language Pacific Rim Symposium's Multilingual Asian Language Workshop (MAL99) in Beijing.

1999 Computing Minimal Semantic Distance Between Noun Pairs in a Chinese Thesaurus and in the English WordNet, NLP Pacific Rim Symposium, Beijing.

2000 As a language researcher on loan to UPenn, I contributed to "Developing Guidelines and Ensuring Consistency for Chinese Text Annotation" presented at the second International Conference on Language Resources and Evaluation.

For ACL 2000 in Hong Kong, I researched how best to build a treebank: "How Should a Large Corpus Be Built? A Comparative Study of Closure in Annotated Newspaper Corpora from Two Chinese Sources, Towards Building a Larger Representative Corpus Merged from Representative Sublanguage Collections". A PDF version, the Perl script with three reference files that I wrote for this study, and my slides are also available.

2003 As a language researcher learning a new language, I began development of a CFG parser for Mongolian and published my research on my Internet web page.

2004 As a federal government language technologist, I briefed the 25th Unicode Conference on "Building the Federal Multilingual Infrastructure in Unicode: Foreign Language Dictionary Tools".

2005 As a computational linguist at CALICO '05 in May, I presented my latest work, including a Mongolian morphological analyzer, in a presentation entitled "The Challenges of Learning and Sharing Knowledge of an LCTL in the 21st Century".

2009 Studies in Chinese Historical Linguistics