Analytic Curriculum Vitae

1. Introduction

Peter Scharf currently teaches and directs research in Indian linguistics and digital humanities as a Visiting Professor at the International Institute of Information Technology in Hyderabad. He taught and directed research in Sanskrit and Indian linguistics as a Visiting Professor in the Department of Humanities and Social Sciences at the Indian Institute of Technology Bombay for the previous three years, and has been invited as a Fellow at the Indian Institute of Advanced Study in Shimla next year. He previously taught Sanskrit at Brown University for nineteen years and has fourteen years of experience developing innovative research and instructional technology for Sanskrit as founder and director of the Sanskrit Library.

Scharf earned his B.A. in philosophy at Wesleyan University and his doctorate in Sanskrit at the University of Pennsylvania, after which he taught Sanskrit and Indian literature at Brown University for 19 years, where he was promoted to senior lecturer and served as concentration advisor and chair of the South Asian Studies Committee. Since 2011, he has held several visiting professorships: Visiting Professor at the Maharishi University of Management Research Institute, Chaire Internationale de Recherche Blaise Pascal at the University of Paris Diderot, Visiting Professor of Sanskrit in the Department of Humanities and Social Sciences at the Indian Institute of Technology Bombay, and Visiting Professor in the Department of Sanskrit Studies at the University of Hyderabad. He is also the director of the Sanskrit Library, which he founded in 2002. While his research focuses on the linguistic traditions of India, Vedic Sanskrit, and Indian philosophy, he has devoted considerable attention over the past several years to Sanskrit computational linguistics and to building a digital Sanskrit library. Sanskrit Library projects provide internet access to Indic lexical resources, texts, and manuscripts. His current research brings the linguistic traditions of India face to face with contemporary formal linguistics. He is developing computational implementations of Pāṇinian grammar and a morphologically and syntactically tagged corpus of Sanskrit texts, and is investigating the use of Pāṇinian models of verbal cognition in computational syntax.

2. Priorities

Real progress in the discovery and propagation of knowledge in the humanities demands a predominant place for the fundamental disciplines of philology and archaeology. These disciplines are the foundation on which cultural studies is built. Systematic interpretive methods must enrich these disciplines; yet one must guard against their distracting attention from substantive issues and against their retreating into empty formalism. As one of the world’s richest cultural heritage languages, and the primary cultural heritage language of South Asia, Sanskrit should be accorded a predominant place in the curriculum.

Methods of research and distribution of knowledge need to keep step with available technology in order to be found relevant. Digital technology can enrich traditional philology by collecting, organizing, and presenting information to scholars in ways that may facilitate new insights. Scharf’s research and teaching utilize the best available methods from the ancient to the cutting edge.

3. Current research

Modern formal and computational linguistics was dominated by English at its inception and developed in subsequent decades primarily in the environment of European languages. More recently there has been a concerted effort to undertake formal linguistic analysis of a wide variety of languages, with particular interest in those with dramatically different features, and to enrich linguistic theory to account for linguistic variety. In spite of this effort, analytic structures and procedures utilized in formal linguistics remain dominated by those invented for, and most suitable for, English and other European languages. Linguistic theory remains unduly weighted in favor of European languages even as its extension to the variety of the world’s languages involves undue complication thereby revealing its inadequacy in representing language universally. Scharf’s current research aims to develop universally adequate linguistic theory by investigating sophisticated Indian linguistic theories, structures, and procedures developed to describe Sanskrit, the structure of which is of a very different character from English.

India developed an extraordinarily rich linguistic tradition over more than three millennia that remains under-appreciated and under-investigated. Scharf described this tradition for the Oxford handbook of the history of linguistics (§7.2 #14) as well as in presentations (§8.1 #43). [§ references in parentheses refer to sections in the résumé in list format.] By the middle of the first millennium BCE, six branches of knowledge ancillary to Vedic texts proper and known as ‘limbs of the Veda’ (vedāṅga) included four concerned with linguistic analysis: metrics (chandas), etymology (nirukta), phonetics (śikṣā), and grammar (vyākaraṇa). In addition, the highly developed philosophical disciplines of logic (nyāya) and ritual exegesis (karmamīmāṁsā) and the polished discipline of literary theory (alaṅkāraśāstra) are concerned with semantic and syntactic analysis. A cursory glance at the long tradition of discussion and argumentation within and between Indian sciences of phonetics (śikṣā), grammar (vyākaraṇa), logic (nyāya), ritual exegesis (karmamīmāṁsā), and literary theory (alaṅkāraśāstra) reveals that Indian linguistic traditions have much to offer contemporary linguistic theory in the areas of phonetics, morphology, syntax, and semantics.

In particular, the Pāṇinian grammatical tradition achieves a sophisticated analysis of Sanskrit that is hardly matched in contemporary linguistics. By the early fourth century BCE, Pāṇini had composed the Aṣṭādhyāyī, which consists of nearly 4,000 rules and constitutes a fairly complete generative grammar of late Vedic Sanskrit. The abundant literature in the form of commentaries and subcommentaries on the Aṣṭādhyāyī adds rich detail and sophisticated interpretive theory. Yet despite more than two thousand years of analysis and refinement of Pāṇini’s systematic analysis in the Indian tradition and two centuries of investigation by modern Indologists, only now with the application of computational methods is it possible to realize the potential contribution of the work fully.

The Indian tradition produced copious lexical resources for Sanskrit beginning with a thesaurus of Vedic terms called the Nighaṇṭu in the middle of the first millennium and exemplified by the Amarakośa. Modern monolingual and bilingual lexicons and dictionaries as well as descriptive grammars describe semantic content and grammatical relations informally. In contrast, Pāṇini’s grammar creates a formal relationship between semantics and speech forms by introducing basic elements and modifying them under specific semantic and syntactic conditions. This formal relationship meets strict contemporary conditions of formal computational semantics. While commentators demonstrate the derivations of speech forms in the Pāṇinian system for selected examples, and seventeenth-century cognitive linguists such as Kauṇḍabhaṭṭa and Nāgeśa synthesize conclusions of the grammatical tradition concerning the semantics associated with various morphological elements, a comprehensive synthesis of the description provided by Pāṇini has never been achieved.

Scharf’s current research contributes to building a bridge between the ancient and the modern, between the difficult-to-penetrate humanistic discipline of Indology and the rigorous formal science of contemporary linguistics. He investigates ways in which Indian linguistics can contribute useful insights to contemporary formal linguistics, and designs ways in which Indian linguistic theories may be formalized and implemented computationally. In July 2017, he completed a formalization of Pāṇini’s Aṣṭādhyāyī in an XML language he developed that is translatable into executable Java code. He is currently developing a computational implementation of this formalization that will produce a comprehensive lexicon of Sanskrit. This lexicon will evince complete Pāṇinian derivations with all their semantic and cooccurrence conditions and will serve as a formal semantic analysis of the Sanskrit terms so derived. The Pāṇinian lexicon will serve as the core of a lexical table that coordinates headwords of numerous lexical resources in the Sanskrit Library’s integrated dictionary. The Pāṇinian analysis contained in the lexicon will also serve to constrain homophonous speech forms in Sanskrit parsing software.

Scharf is currently engaged in translating Kauṇḍabhaṭṭa’s Vaiyākaraṇabhūṣaṇasāra. The structure of verbal cognition described in Kauṇḍabhaṭṭa’s work will serve as a model for formalizing semantic relations that can be projected onto syntax for use in computational syntactic analysis. The computational implementation of Pāṇinian grammar and the Vaiyākaraṇabhūṣaṇasāra translation project grew out of the research project he conducted in a Chaire Internationale de Recherche Blaise Pascal in 2012–2013.

4. Chaire Blaise Pascal

Between February 2012 and July 2013, Scharf was the laureate of a Chaire Internationale de Recherche Blaise Pascal in the Laboratoire de l’Histoire des Théories Linguistiques at the Université Paris Diderot (§3 #14). His research project there focused on Indian semantic and syntactic theory and the semantics-syntax interface where computational linguistic work is flourishing. He drew upon selected major semantic and syntactic treatises in the Indian grammatical tradition and contemporary techniques of formalization and computational implementation to bring ancient Indian theories face to face with contemporary computational linguistic work in a series of lectures (§8.1 #42, #44, #46–#48, #53–#56). On the one hand, the lectures articulated Indian theories in contemporary terms and offered a critique and insights useful to contemporary linguists. On the other hand, the lectures suggested ways of modeling ancient Indian theories computationally in order to allow computational modeling to clarify those ancient theories and assist in answering difficult questions regarding their principles and historicity.

Under the project, Scharf and his assistants developed a morphologically and syntactically tagged database of Sanskrit texts to facilitate syntactic research on Sanskrit. The project culminated in the Seminar on Sanskrit Syntax and Discourse Structures, 13–15 June 2013 at the Université Paris Diderot, and in a workshop on computational Sanskrit syntax, co-hosted with Gérard Huet 17–21 June at the Université Paris Diderot and INRIA Paris (§6.2 #26–#27). The announcement and full program of the seminar, with links to abstracts and papers, can be found at the Sanskrit Library website under Events. Scharf edited a volume of papers selected from those presented by speakers at the Sanskrit syntax seminar (§7.1 #9). The volume includes a comprehensive bibliography of research on Sanskrit syntax conducted over the last 125 years.

5. Current and recent projects

Scharf has headed several projects to integrate digital texts, lexical resources, and linguistic software and to enhance access to Sanskrit manuscripts by integrating digital images of them with corresponding digital editions, a comprehensive dynamic hypertext catalogue, and text-image alignment software. In 2014, he completed a project entitled “Sanskrit lexical sources: digital synthesis and revision,” which digitized major bilingual dictionaries, specialized dictionaries, and traditional thesauri and integrated them into a single dictionary interface, in collaboration with Thomas Malten at the University of Cologne and with funding from the NEH and the Deutsche Forschungsgemeinschaft. In 2016, he completed two projects (§5.2). The first, entitled “Developing automated text-image alignment to enhance access to heritage manuscript images,” researched methods of locating passages of digital editions of texts in manuscript images. The second, entitled “Enhancing access to primary cultural heritage materials of India: cataloging, digitizing, and integrating the Houghton Library’s Indic Manuscript collection with intelligent digital resources,” catalogued the entire collection of 1,700 Sanskrit manuscripts at Harvard University. These and previous projects are described in detail online. The following subsections describe them in brief.

5.1. International digital Sanskrit library integration

The International digital Sanskrit library integration project (§5.2 #12) created a globally distributed, internet-based digital library in Sanskrit from formerly independent projects. It brought together efforts to create Sanskrit digital archives, digital lexica, and linguistic software; to establish text-encoding standards; to enhance access to ancient and medieval manuscripts; and to develop OCR technology and display software. The resulting integrated information system enriches access to digital content in Sanskrit located worldwide and thus enables broad use of this material for research and education. The ready accessibility of web-based materials is especially significant for less commonly taught languages such as Sanskrit.

The project standardized Sanskrit text-encoding (§7.1 #6), revised the Unicode Standard to include characters necessary for Indic cultural heritage (§7.2 #8–#11; §7.4 #1–#14), supplied validated data for optical character recognition, prepared the major digital Sanskrit-English lexicon for integration with linguistic software (§7.6 #16), produced several other digital lexical resources (§7.1 #4–#5; §7.6 #15), produced a full-form Sanskrit lexicon and morphological analyzer, and created XML editions of more than a hundred Sanskrit texts linked to analysis and lexical resources (§7.1 #7). Digital texts provided by the TITUS project at the University of Frankfurt and other sources were linked with lexical resources digitized at the University of Cologne using the morphological analyzer developed under the project. In lines of text in which inter-word sound changes (sandhi) have been analyzed, clicking a word leads to the Sanskrit Library’s morphological analyzer, which in turn links to the Sanskrit Library’s integrated dictionary interface. Clicking a line of text in which sandhi has not been analyzed opens links to the Sanskrit Heritage reader companion developed by Gérard Huet at the Institut National de Recherche en Informatique et en Automatique (INRIA).

This project also fostered international collaboration in the area of Sanskrit computational linguistics. Scharf convened the Second International Sanskrit Computational Linguistics Symposium (§6.2 #18) at Brown University under the project and edited selected papers presented at it (§7.1 #3). The Sanskrit Computational Linguistics Consortium, founded at the event, continues to cultivate progress in the development of OCR of Indic scripts, critical editing software, generative grammars, parsing software, semantic networks, machine translation, tagged corpora, and integrated Sanskrit library software.

5.2. Enhancing access to primary cultural heritage materials of India

The project “Enhancing access to primary cultural heritage materials of India: integrating images of literary sources with digital texts, lexical resources, linguistic software, and the web” aimed to enhance access to primary cultural heritage materials of India housed in American libraries by integrating them with the digital texts, lexical resources, and linguistic software in the Sanskrit Library. The project selected a small but important set of texts represented in the Indic manuscript collections at Brown University and the University of Pennsylvania, and in the Sanskrit Library’s collection of digital texts. The Brown University Library and the Rare Books and Manuscripts Library at the University of Pennsylvania made high-quality digital images of ninety manuscripts of the great Indian epic Mahābhārata and sixty-eight manuscripts of the preeminent Vaiṣṇava text Bhāgavata Purāṇa. Sanskrit Library assistants collected catalog data and inserted that data into the XML template Scharf made. The template incorporates the comprehensive parameters developed by the American Committee for South Asian Manuscripts and the Text Encoding Initiative’s (TEI) Manuscript guidelines. After Scharf completed editing the catalog (§7.1 #8) in May 2012, he and Amey Hutchins, the catalog librarian at the Rare Books and Manuscripts Library at the University of Pennsylvania, worked out parameters to map the completed catalog data onto the standard MARC records used by libraries. Bunker, the software engineer for the project, used this mapping to write code to transform the TEI-conformant XML catalog records to standard MARC records for inclusion in the University of Pennsylvania Library’s catalog. Bunker also wrote a sophisticated search interface that provides access to manuscripts by way of a number of parameters including author, title, scribe, place, and text contained in passages transcribed. The lists of authors, titles, scribes, places, and other significant terms enrich the information provided by major Sanskrit lexical sources currently available, and constitute an additional lexical source that will be added to the Sanskrit Library’s integrated dictionary interface. Scharf described the project and its results in several presentations (§8.1 #32, #39, #49–#50, #57) and publications (§7.2 #19, #21–#24).

5.3. Sanskrit lexical sources: digital synthesis and revision

A project jointly funded by the NEH and the Deutsche Forschungsgemeinschaft extended the Sanskrit Library’s integrated dictionary interface by integrating supplements to the major bilingual dictionaries already included, and by adding specialized dictionaries, indigenous Indian monolingual dictionaries, traditional thesauri, and traditional linguistic analyses. The integrated dictionary includes major bilingual dictionaries such as Monier-Williams’ A Sanskrit-English dictionary, major monolingual Sanskrit dictionaries such as the Vācaspatya and Śabdakalpadruma, and minor obscure specialized lexical resources such as a dictionary of Sanskrit words used for numbers (bhūtasaṁkhyā). The valuable information provided by the latter, which is based upon an article published in a Japanese journal, would otherwise remain beyond the reach of most Sanskrit scholars. By providing access to such sources, the integrated dictionary interface vastly eases access to specialized linguistic resources now consulted only by experts. The project links these lexical resources with digitized Sanskrit texts as described in section 5.1 above.

5.4. Developing automated text-image alignment

The project researched methods to enhance access to primary cultural heritage materials of India by developing human-validated automated text-image alignment techniques in order to provide access to digital images via related machine-readable texts, lexical resources, linguistic software, and a sophisticated search interface. Digital images of manuscripts written in Sanskrit will be integrated into the Sanskrit Library. This integration will allow generalized information extraction and search techniques to reach enormous reservoirs of Sanskrit manuscripts. Integrating primary cultural materials with the Sanskrit Library will thus enable broad use of Indic collections for research and education where Indic materials are grossly underrepresented.

5.5. Cataloging the Houghton Library’s Indic Manuscript collection

This project catalogued the entire collection of 1,700 Sanskrit manuscripts in the Houghton Library at Harvard University. Experts in Indic manuscriptology arranged and fully catalogued each manuscript using the Sanskrit Library’s template developed in the project described in section 5.2. Scharf and associates are still adding references and editing entries, but drafts are available in the Sanskrit Library’s digital manuscript catalogue, which is accessible online.

6. Prior research

While fascinated with the whole range of Sanskrit language, Indian philosophy, and Indian literature, Scharf has been particularly interested in the intellectual history of Indian linguistics and philosophy of language, in the development of conceptions of the self, and in the creativity expressed through the adaptation of ancient narratives in new contexts.

6.1. Indian philosophy of language

One of the central questions of philosophy concerns the basis of general conceptions. The problem extends from that of identifying an object or one’s self to be the same thing as encountered previously to that of classifying objects as of the same type. Both problems are intimately bound to linguistic usage. Both problems pervade the history of Indian philosophy and particularly the Indian philosophy of language.

Scharf’s first book (§7.1 #1) considered various points of view regarding whether general properties exist independently of conception and serve as the grounds for the use of common nouns. In it he examined the arguments of Buddhists and the counterarguments in Mīmāṁsā and Nyāya regarding the existence of independently existing grounds for the use of words and, in particular, the existence of general properties. He then compared Patañjali’s views and arguments in the Mahābhāṣya with those of Vātsyāyana in the Nyāyasūtrabhāṣya and Śabara in the Mīmāṁsābhāṣya regarding what conditions the use of common nouns. Patañjali and Vātsyāyana conclude that common nouns cause awareness of both the class property and the individual object in which the class property inheres because considerations of gender, number, and actions to be performed necessitate comprehension of individuals of the class. Śabara in contrast concludes that common nouns cause cognition only of class properties and that it is the context, rather than the speech form, that brings individuals of the class to awareness. Śabara’s argument that the use of common nouns for likenesses proves his view because likenesses are not individuals of the class, however, involves the fallacious identification of a class property with shape.

Despite confusion concerning the class properties in early Mīmāṁsā, Scharf’s papers on class properties (ākr̥ti) (§7.3 #1–#2) and on intentionality (vivakṣā) (§7.3 #4) demonstrated that the concepts of a class property and a speaker’s intention could not have undergone the process of historical maturation between the second century BCE and the fifth century CE assumed by certain scholars because the concepts of a class property and a speaker’s intention are clearly articulated in Patañjali’s Mahābhāṣya (150 BCE). “Pāṇini, vivakṣā, and kāraka-rule-ordering” (§7.2 #1) examined evidence that Pāṇini himself assumed the principle of a speaker’s intention. The article demonstrated that at least one sūtra in the Aṣṭādhyāyī articulates the principle, and that Kātyāyana as well as Patañjali clearly understood that linguistic convention limits a speaker’s intention in the effective use of language.

The presentations about recognizing speech (§8.1 #4; §8.2 #6) uncovered another case where proper elucidation of Patañjali’s Mahābhāṣya showed that the concept of grasping the speech form prior to its bringing its denoted object to awareness was clearly understood at a period earlier than certain scholars had assumed.

The paper on Kauṇḍabhaṭṭa’s views concerning the semantic conditions for kārakas (§8.2 #4) and the articles dealing with levels (§7.2 #5–#6) demonstrate that Pāṇini does not conceive of distinguishable psychological or procedural levels besides two related domains: meaning and speech. In recent discussions, Kiparsky, a proponent of four levels, agreed that by ‘levels’ he meant only distinct modes of reference, not psychological domains nor separate modules.

“Pāṇini’s use of prohibitive compounds” (§7.3 #3) examined the commentaries on sūtras that derive nouns from verbal roots in the Aṣṭādhyāyī in which the term anupasarge occurs, and demonstrated that the term must be understood as a compound in which the elements are not syntactically related (asamarthasamāsa) in order that it have the sense required by the rules.

“The natural-language foundation of metalinguistic case-use in the Aṣṭādhyāyī and Nirukta” (§7.2 #3) concerns the syntax of metalanguage and its conceptual presuppositions in the works of Pāṇini and Yāska. The article demonstrated that the metalanguage in both texts similarly utilizes both genitives and ablatives to signify the speech form that is the derivational source of a derivate. The usage originates in the overlapping domain of usage of the two cases in syntactic connection with direction words and in lineages in ordinary usage.

“The relation between etymology and grammar in the linguistic traditions of early India” (§7.2 #16) addresses the persistent problem of the historical priority of the Aṣṭādhyāyī and Nirukta. The article argues that the Nirukta is a multi-layered text the oldest layer of which has a lexical focus and antedates the Aṣṭādhyāyī but the later layers of which display cognizance of sophisticated systematic procedures of analysis used in Pāṇini’s work and post-date his work.

There was a recent debate concerning whether Pāṇini’s procedure of deriving correct speech forms begins with semantics, or first posits an approximate speech form as a target that it validates or corrects. “On the semantic foundation of Pāṇinian derivational procedure” (§7.3 #8) demonstrated that the former is the case through a careful examination of the derivation of the compound kumbhakāra and the commentaries on the rules involved. The grammar is a systematic projection of semantics onto speech that begins with objects and relations solely within the semantic domain prior to the introduction of any speech form.

“On the source of the cognition of time in verbs” (§7.2 #36) examines whether Indian linguists consider the verbal root or the verbal inflectional suffix to generate the cognition of time in verbal cognition.

6.2. Linguistics

The request to contribute the chapter on Linguistics in India to the Oxford handbook of the history of linguistics (§7.2 #14) recognized Scharf as an authority on Indian linguistic systems.

Pāṇinian linguistic description

Because linguistic terminology and analysis pervade commentary in every field, and linguistic treatises describe the language in which treatises in every field are written, progress in the intellectual history of Indian linguistics promises to have far-reaching influence on Indian intellectual history generally. Scharf has undertaken to evaluate the descriptions of language undertaken by various ancient Indian linguists and to compare these descriptions with extant Sanskrit texts. “Pāṇinian accounts of the Vedic subjunctive,” (§7.3 #7) evaluates the adequacy of competing accounts of the subjunctive to account for Vedic usage and concludes by preferring the account that depends less on escape rules. The paper likewise argues that comprehensive evaluation of the linguistic system and the text is required to evaluate the degree of correlation between the linguistic description and the text and that the procedure of examining selected individual forms and rules used by scholars previously is inadequate because different cases lead to contradictory results.

While the previous paper considered the relation of grammatical rules to Vedic forms, “Pāṇinian accounts of the class eight presents” (§7.3 #6) considered how variation in lists that supplement the set of rules, particularly the dhātupāṭha, alters the linguistic description of the linguistic system that comprises those lists. The paper evaluates the adequacy of competing linguistic accounts of verbal forms proffered by commentators on root-lists (dhātupāṭha) that represent and categorize roots differently. In particular the etymologically infelicitous inclusion of the root r̥ṇ in class eight instead of in class five allows the linguistic system to account for the appearance of the form arṇavat in the Atharvaveda without modifying any rules.

The recognition that the dhātupāṭha plays an essential role in the Pāṇinian linguistic system led Scharf to analyze the dhātupāṭha given in the Mādhavīya Dhātuvr̥tti in comparison with the rules of the Aṣṭādhyāyī, to restore roots to the canonical form expected by Pāṇinian rules, and to prepare a digital edition and index (§7.1 #4–#5; §8.1 #18; §8.2 #17).

In other work on the dhātupāṭha (§8.2 #23), Scharf evaluated arguments for and against the view that the list originally included meanings. Although he supported the view that the list generally did not include meanings, he demonstrated that at least some semantic conditions had to be included to prevent contradictions.

Pāṇinian procedure

In a number of articles, Scharf considered problems exposed by his long-term project of formalizing Pāṇinian procedure. Computational formalization forces one to deal systematically with issues and brings together data in new ways that permit new insights.

Pāṇinian rules occasionally include conditions that refer to a subsequent state of derivation. “Rule-blocking and forward-looking conditions in the computational modeling of Pāṇinian derivation” (§7.2 #12) analyzes one such situation in the derivation of perfect active participles and describes a computational solution that delays the decision to accept the result of applying the rule until the subsequent referenced state has been reached.
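The decision-delay idea described above can be illustrated with a minimal Python sketch. It is not the implementation described in the article: the names `ForwardLookingRule` and `derive_with_delay` and the toy labels (`root`, `augment`, `ending-A`) are hypothetical placeholders rather than Pāṇinian elements. The sketch carries a tentative branch (rule applied) and a fallback branch (rule not applied) through the subsequent steps and commits to one of them only when the referenced later state has been reached.

```python
from dataclasses import dataclass
from typing import Callable, List

State = List[str]  # a derivation state as a sequence of toy element labels


@dataclass
class ForwardLookingRule:
    """Hypothetical rule whose acceptance depends on a later state."""
    apply: Callable[[State], State]     # tentative effect of the rule
    reached: Callable[[State], bool]    # has the referenced later state arrived?
    condition: Callable[[State], bool]  # does that later state license the rule?


def derive_with_delay(start: State, rule: ForwardLookingRule,
                      steps: List[Callable[[State], State]]) -> State:
    """Apply `rule` tentatively and decide on it only at the referenced state."""
    tentative = rule.apply(list(start))  # branch that assumes the rule applies
    fallback = list(start)               # branch that assumes it does not
    chosen = None                        # set once the decision has been made
    for step in steps:
        if chosen is not None:
            chosen = step(chosen)
            continue
        tentative, fallback = step(tentative), step(fallback)
        if rule.reached(tentative):
            chosen = tentative if rule.condition(tentative) else fallback
    # If the referenced state never arrives, the tentative result is not accepted.
    return chosen if chosen is not None else fallback


# Toy rule: insert an augment after the root, valid only if "ending-A" follows.
rule = ForwardLookingRule(
    apply=lambda s: [s[0], "augment"] + s[1:],
    reached=lambda s: s[-1].startswith("ending"),
    condition=lambda s: s[-1] == "ending-A",
)

accepted = derive_with_delay(["root", "affix"], rule, [lambda s: s + ["ending-A"]])
rejected = derive_with_delay(["root", "affix"], rule, [lambda s: s + ["ending-B"]])
print(accepted)  # ['root', 'augment', 'affix', 'ending-A']
print(rejected)  # ['root', 'affix', 'ending-B']
```

Keeping both branches alive is only one possible design; it trades extra work for the ability to defer the decision without backtracking.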

More difficult is how to determine which of conflicting rules takes precedence. When the domain of a rule is wholly contained within the domain of another rule, the rule with the narrower domain must take precedence just by virtue of the fact that it was stated; otherwise it would have no scope. However, when each of two rules with overlapping domains has scope in its own proper domain, some criterion is required to determine which takes precedence in the overlapping domain. “Rule selection in the Aṣṭādhyāyī or Is Pāṇini’s grammar mechanistic?” (§7.2 #13) carefully examines criteria proposed by ancient Indian grammarians and modern scholars but finds no single consistent universal solution. The examination leaves in place the solution proposed by Patañjali, that the desired rule applies, despite the fact that it is deemed unsatisfactory because it renders the operation of the grammar subject to knowledge of its outcomes. The paper concludes that a complete examination of the complex problem, which has been and could only be examined partially otherwise, will require the assistance of computational modeling. “On the resolution of conflict between accentual rules and other rules of derivation in Pāṇinian grammar” examines Rāmacandra’s solutions to the conflict between the accentual rule A. 6.1.186 and the single replacement rule A. 6.1.97 in the derivation of the third person plural verbal form pacanti, and accepts his conclusion of the priority of the accent rule, but by reason of the inclusion of the term upadeśa referring to the state of original instruction rather than for the reasons he gives.
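The containment criterion stated at the start of this discussion can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a model of the Aṣṭādhyāyī: rule domains are represented as finite sets of hypothetical context labels, and the names `Rule`, `select_rule`, and `UnresolvedConflict` are invented for the example. A rule whose domain is a proper subset of every competing domain wins; merely overlapping domains are reported as unresolved, since they require some further criterion.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    name: str
    domain: FrozenSet[str]  # toy contexts in which the rule is applicable


class UnresolvedConflict(Exception):
    """Raised when domain containment alone cannot decide between rules."""


def select_rule(rules: List[Rule], context: str) -> Rule:
    """Pick the applicable rule whose domain is contained in all the others."""
    applicable = [r for r in rules if context in r.domain]
    if not applicable:
        raise LookupError(f"no rule applies in context {context!r}")
    for candidate in applicable:
        others = [r for r in applicable if r is not candidate]
        # For frozensets, `<` tests for a proper subset.
        if all(candidate.domain < other.domain for other in others):
            return candidate
    names = ", ".join(r.name for r in applicable)
    raise UnresolvedConflict(f"overlapping domains, no containment: {names}")


general = Rule("general", frozenset({"a", "b", "c"}))
special = Rule("special", frozenset({"a"}))
print(select_rule([general, special], "a").name)  # special
print(select_rule([general, special], "b").name)  # general

left = Rule("left", frozenset({"a", "b"}))
right = Rule("right", frozenset({"b", "c"}))
try:
    select_rule([left, right], "b")
except UnresolvedConflict as err:
    print("unresolved:", err)
```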

The section of rules in the Aṣṭādhyāyī that introduces stem-forming affixes (vikaraṇa) is interpreted by commentators as containing locatives that refer to the right context of the affix to be introduced. This interpretation is accepted by modern scholars too. Yet such an interpretation necessitates complicating the derivation of accents which in Pāṇinian grammar are adjusted at each step of derivation. “Teleology and the simplification of accentuation in Pāṇinian derivation (§7.2 #17) argues that the complication is removed by interpreting the locatives as locatives of domain rather than right context locatives. With a locative of domain, the rules introduce the stem-forming afixes on condition that certain referenced forms will be introduced later. A decision-delay procedure such as was described in the article on forward-looking conditions above (§7.2 #12) avoids indeterminism in these rules. The result is a more elegant systematic description of accentuation. “On the status of nominal terminations in upapada compounds” (§7.2 #32) reinforces this conclusion by examining rules that form nominal derivates that occur only as the final constituents of compounds. The paper demonstrates that even when the subordinate constituents in such compounds are referred to explicitly with terms denoting nominal terminations, the nominal terminations cannot be present at that stage of the derivation. Conversely, “Are taddhita affixes provided after prātipadikas or after padas?” (§7.2 #33) shows that because Pāṇinian procedure does indeed require nominal terminations on speech forms after which taddhita affixes are introduced, it is necessary to retain application of the term aṅga `stem’ conjointly with the term pada `word’ in order for required operations to take place, despite the fact that headings under which these terms are introduced indicate that the latter term alone should remain applicable. 
“A computational implementation of Pāṇini’s derivational morphology of Sanskrit” (§7.2 #37) considers the interplay of formal affixation headings and subordinate headings that state semantic conditions in the taddhita section concerned with the derivation of secondary nominal derivates. The paper argues that Pāṇini operated with a constrained multiple inheritance structure that efficiently maps affixes to meanings to account for the complex homonymy and synonymy of the derivates.

“Modeling Pāṇinian grammar” (§7.2 #5) compares obvious methods to implement a few aspects of Sanskrit grammar computationally, comments upon the degree to which they approach or depart from Pāṇinian methodology, and exemplifies methods that would achieve a closer model. The question of levels and the role of semantics are dealt with at some length. The article demonstrates the extent to which the grammar is founded upon semantics and concludes that Pāṇini conceived of just two levels, meaning and sound, generating the latter from the former. “An XML formalization of the Aṣṭādhyāyī” (§7.2 #29) describes the structure of a formalization of the rules of the Aṣṭādhyāyī amenable to computational implementation of those rules, while “Some issues in formalizing the Aṣṭādhyāyī” (§7.2 #30) describes various problems in the Pāṇinian description brought to light by the attempt to formalize these rules and the solutions proposed for them.

Pāṇinian commentary

Commentaries on Pāṇini’s Aṣṭādhyāyī, besides helping to elucidate the grammar, its procedures, and its description of linguistic phenomena, present interesting exegetical, critical, and historical problems. “Counterexamples (pratyudāharaṇa) in Pāṇinian Grammar” (§7.2 #24) examines the syntax and function of counterexamples in the Kāśikā commentary, isolates the defining feature of the counterexample, and reveals textual problems where counterexamples do not conform to this feature.

Computational linguistics

“An analytic database of the Aṣṭādhyāyī” (§7.2 #18) describes the database of Pāṇini’s Aṣṭādhyāyī Scharf created in 1991. The database analyzes sandhi in sūtras, provides morphological identification of each word, analyzes compounds, indicates recurrence of terms, indicates the type of each sūtra and the significance of the case used in each word. It likewise provides extensive classification of grammatical elements including roots, gaṇa-elements, affixes, augments, markers, and pratyāhāras and other technical terms.

Sanskrit presents a number of challenges for computational linguistic processing including rich morphology, an enormous corpus and the fact that sandhi and script conventions obscure word boundaries. The coauthored article, “A distributed platform for Sanskrit processing,” (§7.2 #20) describes innovative solutions to these problems developed through international collaboration. Solutions include efficient segmenting and tagging algorithms, dependency parsers based on constraint programming, and the integration of lexical resources, text archives and linguistic software through distributed interoperable Web services.


Scharf’s article entitled “Vedic accent: underlying versus surface” (§7.2 #15) and related papers (§8.1 #12, #47) evaluated the descriptions of accentuation in various Vedic phonetic treatises (prātiśākhya) and compared them with the practices of marking accent in editions and manuscripts of the corresponding Vedic texts. The article distinguished two traditions of recitation that are commonly confounded: an earlier one, exemplified in the Aṣṭādhyāyī and Vājasaneyiprātiśākhya, in which the high pitch (udātta) is the highest tone, and a later one, exemplified in the R̥kprātiśākhya, in which the circumflex (svarita) is recited higher than the high pitch. The traditions agree in marking what is recited highest with a vertical line above but differ in which underlying accent is so marked. Concurrently, under the purview of the digital Sanskrit library project described above, Scharf compiled a detailed account of accent marks and other Vedic characters in order to prepare a proposal to extend Indian script code blocks in the Unicode standard to allow adequate representation of Vedic (§7.2 #8–#11; §7.4 #1–#14; §7.6 #17). A recent paper (§8.2 #50) demonstrated that the R̥kprātiśākhya system influenced the descriptions and understanding of accent in other Vedic traditions. Several other papers and presentations consider issues in the derivation of accent in Pāṇinian grammar (§7.2 #17, #35; §8.1 #51; §8.2 #30).


“Clause-initial dvayám” (§7.3 #5) corrected the syntactic analysis of a passage containing the term dvayám in the Śatapathabrāhmaṇa by considering relevant Pāṇinian rules, sections in modern grammars and the related ritual texts.

“Interrogatives and word-order in Sanskrit” (§8.2 #7, #11; §7.2 #25) refuted a contemporary theoretical over-generalization that constrained the position of interrogatives to sentence-final position in head-final languages. The paper demonstrated not only that the particular serialization does not hold for Sanskrit but also that the premise that roles are associated fundamentally with position is misguided, even if there is an established usual order.

Scharf edited a volume of papers presented at the seminar on Sanskrit syntax he organized in Paris 13-15 June 2013, of which the program is available at (§7.1 #8). The volume includes three papers for which computational syntactic research on Sanskrit was carried out under his direction (§7.2 #26–#28). The first demonstrated a significant departure in the order of complements and heads in poetry from their order in prose. While agents, objects, adverbs and instruments preceded their head verbs, and qualifiers and genitives preceded what they qualified, ninety percent of the time in prose, most complements preceded their heads only sixty-six percent of the time in poetry; qualifiers and instruments preceded their heads only about forty percent of the time. The second paper analyzed the explicit and implicit provisions Pāṇini makes concerning the cooccurrence conditions of preverbs and verbal roots and utilized the analysis in a computational analysis of the Pāṇinian account of middle voice (through the provision of ātmanepada terminations). The third paper presented Sanskrit metrical-analysis software based upon Kedārabhaṭṭa’s Vr̥ttaratnākara.

Phonetics and encoding

Linguistic issues in encoding Sanskrit (§7.1 #6) surveyed technologies for representing the Sanskrit language in writing (Indic scripts, particularly Devanāgarī script and its standard Romanization) and in digital encoding systems (from legacy fonts and meta-encodings to Unicode), analyzed the history of encoding in terms of transitions in the medium of knowledge transmission, critiqued encoding systems and scripts on the ground of fundamental principles of precision in information transmission, analyzed Sanskrit phonology, and designed accurate and consistent segmental, featural, and ASCII-based encoding schemes for Sanskrit. “Linguistic issues and intelligent technological solutions in encoding Sanskrit” (§7.3 #9) makes a concise presentation of the essential issues. Current encoding systems reproduce deficiencies inherent in traditional writing systems. The contemporary use of computers for the manipulation of linguistic and textual data demands more relevant information-processing principles. Encoding is relative to the information to be conveyed in the structure of the language represented. Distinctive elements should be encoded consistently and unambiguously. Doing so requires selecting phonic or graphic units, segments or features, and determining precise criteria for contrasting elements.

6.3. Indian Philosophy

The request for his contributions to the Routledge Encyclopedia of Hinduism recognized Scharf’s expertise in soteriological branches of Indian philosophy (Sāṅkhya, Yoga, and Vedānta). He wrote some twenty-seven articles dealing with various systems of Indian philosophy, including the variant lists of which systems there are, and modes of evidence used in them. Prominent are articles on Sāṅkhya, Yoga, karma, ātman, and brahman (§7.2 #4).

6.4. The Self

Scharf has been interested in concepts of the self in both European and Indian philosophy since he was an undergraduate. He developed these ideas during a seminar he taught on Yoga philosophy at the University of Virginia (§9 #5), presented them at lectures at Pennsylvania State University (§8.1 #1), and explored them at Brown in a course on concepts of the self (§9 #3), and in Sanskrit reading courses in the Upaniṣads, Śaṅkara’s commentary on the Bhagavadgītā, Patañjali’s Yogasūtra, and Buddhist philosophy (§9 #8b). This led him to elucidate the nature of consciousness in the Br̥hadāraṇyaka Upaniṣad in his paper on sañjñā (§8.2 #12). In his piece, “Creation Mythology and Enlightenment,” (§8.1 #5) he explored how themes in Indian mythology concerning the origin of the universe complement philosophical themes concerning the full discovery of the self. He has distant plans to retranslate Patañjali’s Yogasūtra and Vyāsa’s Yogabhāṣya on the basis of a study of Nāgeśa’s untranslated commentary.

6.5. Narrative Adaptation

One of the most enjoyable pursuits in the study of Sanskrit literature is to explore the motivations for the adaptation of narratives in subsequent versions. In advanced Sanskrit reading courses he often traced the interpretation of a Vedic myth in various genres through the history of Indian literature (§9 #8d). One such myth is that of Purūravas and Urvaśī. In a paper entitled “The Compassionate Urvaśī” (§8.1 #7; §8.2 #5), he reexamined the interpretation of R̥gveda 10.95 in subsequent Indian literature, including in Ṣaḍguruśiṣya’s Vedārthadīpikā. Because Ṣaḍguruśiṣya’s work is full of narratives related to various R̥gvedic hymns, he was led to collect manuscripts of the work for the purpose of making a critical edition. With the assistance of a grant from the American Philosophical Society (§5.2 #2) he collected some sixty manuscripts of the text. Similarly, tracing the interpretation of the story of Rāma in various genres in advanced Sanskrit classes led to his book on the Rāmopākhyāna (§7.1 #2), and to papers on the ethics of the final episodes and Sītā’s divinity (§8.1 #22). Editions, and even more so translations, flatten the texture of a work, thereby presenting it as a more integrated whole. Scharf’s work digitally cataloguing manuscripts of the Mahābhārata, in a project described elsewhere in this document, revealed interesting facts about the transmission and constitution of that text. His paper, “Five jewels in the University of Pennsylvania’s Rare Book and Manuscript Library,” examines the relationship of five subsections of the Mahābhārata that were often transmitted as a unit and corroborates the view that these works were originally independent pieces. Exploration of the methods of, and motivations for, the adaptation of ritual practice is equally intriguing. His student Kartik Venkatesh and he analyzed the structure and adaptation of pūjā in five major Indian festivals and have nearly completed a book on the topic.

In a paper presented at the Montreal Mahābhārata Conference and published as a chapter in Rukmani’s book on the Mahābhārata, he argued for the publication of multilevel text (§7.2 #2). In contrast to a translation devoid of notes that presents the reader with a single, static, flat version of a narrative, multilevel text that includes comments, notes, alternatives, multimedia accompaniments, etc. can communicate the source work more fully and meaningfully. His book on the Rāmopākhyāna (§7.1 #2) is just such a work.

In numerous presentations and writings he has argued that we are in the midst of a major media-transition (§8.1 #9–#11; §7.1 #6; §7.3 #9). The transition from printed to digital media is comparable to the transitions from oral to written media, and from written to printed media. Productions of human knowledge that don’t convert to the new media recede from public awareness and perish in oblivion. In order to ensure that the vast body of knowledge contained in Sanskrit texts survives the transition to digital media, and to take advantage of the greater range and flexibility of presentation it offers, he founded a digital Sanskrit library.

6.6. Digital Sanskrit library

In order to facilitate comprehensive comparisons between various ancient Indian linguistic descriptions and Sanskrit texts, and to facilitate general access to Sanskrit literature, lexica, grammars, and manuscripts, he is engaged in a project to develop an integrated international digital Sanskrit library. After receiving several minor grants, he obtained a major grant from the National Science Foundation (NSF) for 2006–2009 (§5.2 #12–#14). The project, funded by the NSF’s Division of Intelligent Systems, integrated the linguistic software modeling Pāṇinian inflection and sandhi rules, developed by Scharf and his late colleague Malcolm Hyman at Brown, with bilingual lexical resources digitized in the Cologne digital Sanskrit lexicon project, and machine-readable Sanskrit texts in the TITUS archive at Frankfurt. Since then he has obtained four additional major grants from the National Endowment for the Humanities to catalogue and digitize Sanskrit manuscripts, to develop image-text alignment software, and to digitize and integrate Sanskrit lexical resources (§5.2 #15–#16; §5.1 #1–#2). The Sanskrit Library is building an integrated digital library that allows seamless access to grammatical information and lexical sources by clicking words in texts, access to citations in context by clicking citations in lexical sources, and focused access to sought passages in manuscript images. The system will facilitate linguistic, philological, and topical research in Sanskrit generally, much as the Perseus project has in Classical philology.

One task of the Sanskrit library project was to develop encoding standards. Already mentioned is the project to extend Indian script code blocks in the Unicode standard to accommodate special characters in Vedic. The Unicode standard is a script-based encoding. While using Unicode for display, he devised an independent phonology-based encoding scheme to facilitate internal linguistic processing and allow users to choose modes of display including both Roman and Devanāgarī. Hyman and he considered the linguistic issues in coding Sanskrit, or any language, in their book Linguistic Issues in Encoding Sanskrit (§7.1 #6). At the Sanskrit Library website, morphological software is located under ‘Tools’, an integrated dictionary interface is available under ‘Reference’, and digital texts linked to analytic tools are available under ‘Texts’.

7. Teaching

Great ideas, profound insights, and penetrating discoveries are best appreciated and most inspiring when approached through the original expressions of the sages, scientists, and leaders who first articulated them. This is why foundational religious texts and great classics continue to attract seekers of knowledge far beyond their time of composition. Learning to access such works, preferably in their original language, shaped Scharf’s own education in the history of philosophy in college and in Sanskrit studies in graduate school. Leading students to appreciate such works for themselves is the guiding principle of Scharf’s teaching.

To gain true appreciation of ideas requires internalizing them so that they become a part of one’s intimate experience. The process of internalization requires active participation in the learning process and utilization of diverse avenues of learning: aural, visual, and hands-on. Thus principles of universal access govern Scharf’s preparation of teaching materials and class structure. Language materials, for instance, include carefully designed tables, charts, web-materials, and audio files, and class includes interactive oral work and visuals, as well as written exercises. Classes include lectures with prepared slide presentations as well as guided discussion using the Socratic method.

Scharf taught an Introduction to Hinduism (§9 #2), a course on the Yogasūtra and commentaries (§9 #5), and a course on the Aitareya Brāhmaṇa (§9 #8diii) in the Department of Religious Studies at the University of Virginia in the spring of 1992. He taught a general course on South Asian Civilization (§9 #1), a course on Concepts of Self in Classical Indian Literature (§9 #3), and all levels and genres of Sanskrit literature (§9 #7–#8) for nineteen years in the Department of Classics at Brown University, where he was promoted to senior lecturer and served as concentration advisor and chair of the South Asian Studies Committee. There he wrote his own introduction to Sanskrit with audio materials, called Śabdabrahman, for use in his first-year classes. His independent-study reader Rāmopākhyāna: the Story of Rāma in the Mahābhārata (§7.1 #2) has garnered enthusiastic reviews and is widely used in Sanskrit classes. The digital version with its index, found under Pedagogy at the Sanskrit Library website, served as the model to develop two other Sanskrit texts for display in the Kramapāṭha reader: Pūrṇabhadra’s Pañcākhyānaka and Pāṇini’s Aṣṭādhyāyī. A fourth, Viṣṇu Purāṇa, Book 4, awaits final editing. In addition, he developed video clips to demonstrate Devanāgarī character formation. Other Sanskrit exercises with intelligent feedback systems that draw on transliteration, sandhi, and inflection software are under development. Scharf currently teaches courses in Indian cultural tradition, historical and comparative linguistics, Indian linguistic theory, and Pāṇinian grammar at the Indian Institute of Technology Bombay.