Analytic Curriculum Vitae

1. Introduction

Peter Scharf currently teaches and directs research in Indian linguistics and digital humanities as a Visiting Professor at the International Institute of Information Technology in Hyderabad. He taught and directed research in Sanskrit and Indian linguistics as a Visiting Professor in the Department of Humanities and Social Sciences at the Indian Institute of Technology Bombay for the previous three years, and is invited as a Fellow at the Indian Institute of Advanced Study in Shimla next year. He previously taught Sanskrit at Brown University for nineteen years and has fourteen years of experience developing innovative research and instructional technology for Sanskrit as founder and director of the Sanskrit Library (https://sanskritlibrary.org).

Scharf earned his B.A. in philosophy at Wesleyan University and his doctorate in Sanskrit at the University of Pennsylvania, after which he taught Sanskrit and Indian literature at Brown University for 19 years, where he was promoted to senior lecturer and served as concentration advisor and chair of the South Asian Studies Committee. Since 2011, he has held several visiting professorships: Visiting Professor at the Maharishi University of Management Research Institute, Chaire Internationale de Recherche Blaise Pascal at the University of Paris Diderot, Visiting Professor of Sanskrit in the Department of Humanities and Social Sciences at the Indian Institute of Technology Bombay, and Visiting Professor in the Department of Sanskrit Studies at the University of Hyderabad. He is also the director of the Sanskrit Library, which he founded in 2002. While his research focuses on the linguistic traditions of India, Vedic Sanskrit, and Indian philosophy, he has devoted considerable attention over the past several years to Sanskrit computational linguistics and building a digital Sanskrit library (https://sanskritlibrary.org). Sanskrit Library projects provide internet access to Indic lexical resources, texts, and manuscripts. His current research brings the linguistic traditions of India face to face with contemporary formal linguistics. He is developing computational implementations of Pāṇinian grammar and a morphologically and syntactically tagged corpus of Sanskrit texts, and is investigating the use of Pāṇinian models of verbal cognition in computational syntax.

2. Priorities

Real progress in the discovery and propagation of knowledge in the humanities demands a predominant place for the fundamental disciplines of philology and archaeology. These disciplines are the foundation on which cultural studies is built. Systematic interpretive methods must enrich these disciplines; yet one must guard against letting such methods distract attention from substantive issues or retreat into empty formalism. As one of the world’s richest cultural heritage languages, and the primary cultural heritage language of South Asia, Sanskrit deserves a predominant place in the curriculum.

Methods of research and distribution of knowledge need to keep step with available technology in order to be found relevant. Digital technology can enrich traditional philology by collecting, organizing, and presenting information to scholars in ways that may facilitate new insights. Scharf’s research and teaching utilize the best available methods from the ancient to the cutting edge.

3. Current research

Modern formal and computational linguistics was dominated by English at its inception and developed in subsequent decades primarily in the environment of European languages. More recently there has been a concerted effort to undertake formal linguistic analysis of a wide variety of languages, with particular interest in those with dramatically different features, and to enrich linguistic theory to account for linguistic variety. In spite of this effort, analytic structures and procedures utilized in formal linguistics remain dominated by those invented for, and most suitable for, English and other European languages. Linguistic theory remains unduly weighted in favor of European languages even as its extension to the variety of the world’s languages involves undue complication thereby revealing its inadequacy in representing language universally. Scharf’s current research aims to develop universally adequate linguistic theory by investigating sophisticated Indian linguistic theories, structures, and procedures developed to describe Sanskrit, the structure of which is of a very different character from English.

India developed an extraordinarily rich linguistic tradition over more than three millennia that remains under-appreciated and under-investigated. Scharf described this tradition for the Oxford handbook of the history of linguistics (§7.2 #14) as well as in presentations (§8.1 #43). [§ references in parentheses refer to sections in the résumé in list format.] By the middle of the first millennium bce six branches of knowledge ancillary to Vedic texts proper and known as ‘limbs of the Veda’ (vedāṅga) included four concerned with linguistic analysis: metrics (chandas), etymology (nirukta), phonetics (śikṣā), and grammar (vyākaraṇa). In addition, the highly developed philosophical disciplines of logic (nyāya) and ritual exegesis (karmamīmāṁsā), and the polished discipline of literary theory (alaṅkāraśāstra), are concerned with semantic and syntactic analysis. A cursory glance at the long tradition of discussion and argumentation within and between Indian sciences of phonetics (śikṣā), grammar (vyākaraṇa), logic (nyāya), ritual exegesis (karmamīmāṁsā), and literary theory (alaṅkāraśāstra) reveals that Indian linguistic traditions have much to offer contemporary linguistic theory in the areas of phonetics, morphology, syntax, and semantics.

In particular, the Pāṇinian grammatical tradition achieves a sophisticated analysis of Sanskrit that is hardly matched in contemporary linguistics. By the early fourth century bce Pāṇini had composed the Aṣṭādhyāyī, consisting of nearly 4,000 rules, that constitutes a fairly complete generative grammar of late Vedic Sanskrit. The abundant literature in the form of commentaries and subcommentaries on the Aṣṭādhyāyī adds rich detail and sophisticated interpretive theory. Yet despite more than two thousand years of analysis and refinement of Pāṇini’s systematic analysis in the Indian tradition and two centuries of investigation by modern Indologists, only now with the application of computational methods is it possible to realize the potential contribution of the work fully.

The Indian tradition produced copious lexical resources for Sanskrit beginning with a thesaurus of Vedic terms called the Nighaṇṭu in the middle of the first millennium and exemplified by the Amarakośa. Modern monolingual and bilingual lexicons and dictionaries as well as descriptive grammars casually describe semantic content and grammatical relations. In contrast, Pāṇini’s grammar creates a formal relationship between semantics and speech forms by introducing basic elements and modifying them under specific semantic and syntactic conditions. This formal relationship meets strict contemporary conditions of formal computational semantics. While commentators demonstrate the derivations of speech forms in the Pāṇinian system for selected examples, and seventeenth-century cognitive linguists such as Kauṇḍabhaṭṭa and Nāgeśa synthesize conclusions of the grammatical tradition concerning the semantics associated with various morphological elements, comprehensive synthesis of the description provided by Pāṇini has never been achieved.

Scharf’s current research contributes to building a bridge between the ancient and the modern, between the difficult-to-penetrate humanistic discipline of Indology and the rigorous formal science of contemporary linguistics. He investigates ways in which Indian linguistics can contribute useful insights to contemporary formal linguistics, and designs ways in which Indian linguistic theories may be formalized and implemented computationally. In July 2017, he completed a formalization of Pāṇini’s Aṣṭādhyāyī in an XML language he developed that is translatable into executable code. He is currently developing a computational implementation of this formalization that will produce a comprehensive lexicon of Sanskrit. This lexicon will evince complete Pāṇinian derivations with all their semantic and cooccurrence conditions and will serve as a formal semantic analysis of the Sanskrit terms so derived. The Pāṇinian lexicon will serve as the core of a lexical table that coordinates headwords of the forty-five lexical resources in the Sanskrit Library’s integrated dictionary. The Pāṇinian analysis contained in the lexicon will also serve to constrain homophonous speech forms in Sanskrit parsing software.

The structure of verbal cognition described by Indian cognitive scientists in the seventeenth and eighteenth centuries, in particular Kauṇḍabhaṭṭa’s works, will serve as a model according to which to formalize semantic relations that can be projected onto syntax for use in computational syntactic analysis. Scharf described how cognitive structure is projected onto language in a paper, “Insights from Pāṇinian grammar and theory of verbal cognition for representing non-linear syntax: developing language-neutral syntactic representation,” (§7.2 #39) at a conference on new directions in linguistics held at the Indian Institute of Advanced Study, Shimla, at the end of October 2017. Scharf is currently engaged in translating Kauṇḍabhaṭṭa’s Vaiyākaraṇabhūṣaṇa, Vaiyākaraṇabhūṣaṇasāra and selected Sanskrit elucidations of the former to examine the structure of verbal cognition detailed in these works and to express that detail in his formalization of the Aṣṭādhyāyī. The computational implementation of Pāṇinian grammar and the Vaiyākaraṇabhūṣaṇa translation project grew out of the research project he conducted in a Chaire Internationale de Recherche Blaise Pascal 2012–2013.

4. Chaire Blaise Pascal

Between February 2012 and July 2013, Scharf was the laureate of a Chaire Internationale de Recherche Blaise Pascal in the Laboratoire de l’Histoire des Théories Linguistiques at the Université Paris Diderot (§3 #13). His research project there focused on Indian semantic and syntactic theory and the semantics-syntax interface where computational linguistic work is flourishing. He drew upon selected major semantic and syntactic treatises in the Indian grammatical tradition and contemporary techniques of formalization and computational implementation to bring ancient Indian theories face to face with contemporary computational linguistic work in a series of lectures (§8.1 #42, #44, #46–#48, #53–#56). On the one hand, the lectures articulated Indian theories in contemporary terms and offered a critique and insights useful to contemporary linguists. On the other hand, the lectures suggested ways of modeling ancient Indian theories computationally in order to allow computational modeling to clarify those ancient theories and assist in answering difficult questions regarding their principles and historicity.

Under the project, Scharf and his assistants developed a morphologically and syntactically tagged database of Sanskrit texts to facilitate syntactic research on Sanskrit. The project culminated in the Seminar on Sanskrit Syntax and Discourse Structures, 13–15 June 2013 at the Université Paris Diderot, and in a workshop on computational Sanskrit syntax, co-hosted with Gérard Huet 17–21 June at the Université Paris Diderot and INRIA Paris (§6.2 #26–#27). The announcement and full program of the seminar, with links to abstracts and papers, can be found at the Sanskrit Library website under Events (https://sanskritlibrary.org/syntaxParis/announcement.html). Scharf edited a volume of papers selected from those presented by speakers at the Sanskrit syntax seminar (§7.1 #9). The volume includes a comprehensive bibliography of research on Sanskrit syntax conducted over the last 125 years.

5. Current and recent projects

Scharf has headed several projects to integrate digital texts, lexical resources, and linguistic software and to enhance access to Sanskrit manuscripts by integrating digital images of them with corresponding digital editions, a comprehensive dynamic hypertext catalogue, and text-image alignment software. In 2014, he completed a project entitled “Sanskrit lexical sources: digital synthesis and revision,” which digitized and integrated major bilingual dictionaries, specialized dictionaries, and traditional thesauri into an integrated dictionary interface in collaboration with Thomas Malten at the University of Cologne with funding from the NEH and the Deutsche Forschungsgemeinschaft. In 2016, he completed two projects (§5.2). The first, entitled “Developing automated text-image alignment to enhance access to heritage manuscript images,” researched methods for locating, in manuscript images, passages found in digital editions of texts. The second, entitled “Enhancing access to primary cultural heritage materials of India: cataloging, digitizing, and integrating the Houghton Library’s Indic Manuscript collection with intelligent digital resources,” catalogued the entire collection of 1,700 Sanskrit manuscripts at Harvard University. These and previous projects are described in detail at https://sanskritlibrary.org/projects.html. The following subsections describe these projects and current research related to them in brief.

5.1. International digital Sanskrit library integration

The International digital Sanskrit library integration project (§5.2 #12) created a globally distributed, internet-based digital library in Sanskrit from formerly independent projects. It brought together projects to create Sanskrit digital archives, digital lexica, and linguistic software; to establish text-encoding standards; to enhance ancient and medieval manuscript access; and to develop OCR technology and display software. The resulting integrated information system enriches access to digital content in Sanskrit located worldwide and thus enables broad use of this material for research and education. The ready accessibility of web-based materials is especially significant for less commonly taught languages such as Sanskrit.

The project standardized Sanskrit text-encoding (§7.1 #6), revised the Unicode Standard to include characters necessary for Indic cultural heritage (§7.2 #8–#11; §7.4 #1–#14), supplied validated data for optical character recognition, prepared the major digital Sanskrit-English lexicon for integration with linguistic software (§7.6 #16), produced several other digital lexical resources (§7.1 #4–#5), produced a full-form Sanskrit lexicon and morphological analyzer (§7.6 #15), and created XML editions of more than a hundred Sanskrit texts linked to analysis and lexical resources (§7.1 #7). Digital texts provided by the TITUS project at the University of Frankfurt and other sources were linked with lexical resources digitized at the University of Cologne using the morphological analyzer developed under the project. In lines of text in which inter-word sound changes (sandhi) have been analyzed, clicking a word leads to the Sanskrit Library’s morphological analyzer which in turn links to the Sanskrit Library’s integrated dictionary interface (https://sanskritlibrary.org/integratedDictionaries.html). Clicking a line of text in which sandhi has not been analyzed opens links to the Sanskrit Heritage reader companion developed by Gérard Huet at the Institut National de Recherche en Informatique et en Automatique (INRIA).

This project also fostered international collaboration in the area of Sanskrit computational linguistics. Scharf convened the Second International Sanskrit Computational Linguistics Symposium (§6.2 #18) at Brown University under the project and edited selected papers presented at it (§7.1 #3). At the event, he co-founded the Sanskrit Computational Linguistics Consortium, which continues to cultivate progress in the development of OCR of Indic scripts, critical editing software, generative grammars, parsing software, semantic networks, machine translation, tagged corpora, and integrated Sanskrit library software. Scharf serves on the program committees of the symposia the consortium organizes, the sixth of which will take place at IIT Kharagpur in October, and on program committees of the related sections at World Sanskrit Conferences. He will also co-convene a conference that includes a section on Sanskrit, digital humanities, and computational linguistics hosted by the Institute of Tibetan Studies at Sichuan University in Chengdu, China in April.

5.2. Enhancing access to primary cultural heritage materials of India

The project, “Enhancing access to primary cultural heritage materials of India: integrating images of literary sources with digital texts, lexical resources, linguistic software, and the web” aimed to enhance access to primary cultural heritage materials of India housed in American libraries by integrating them with the digital texts, lexical resources, and linguistic software in the Sanskrit Library. The project selected a small but important set of texts represented in the Indic manuscript collections at Brown University and the University of Pennsylvania, and in the Sanskrit Library’s collection of digital texts. The Brown University Library and the Rare Books and Manuscripts Library at the University of Pennsylvania made high-quality digital images of ninety manuscripts of the great Indian epic Mahābhārata, and sixty-eight manuscripts of the preeminent Vaiṣṇava text Bhāgavata Purāṇa. Sanskrit Library assistants collected catalog data and inserted that data into the XML template Scharf made. The template incorporates the comprehensive parameters developed by the American Committee for South Asian Manuscripts and the Text Encoding Initiative’s (TEI) Manuscript guidelines. After Scharf completed editing the catalog (§7.1 #8) in May 2012, he and Amey Huchins, the catalog librarian at the Rare Books and Manuscripts Library at the University of Pennsylvania, worked out parameters to map the completed catalog data onto the standard MARC records used by libraries. Bunker, the software engineer for the project, used this mapping to write code to transform the TEI-conformant XML catalog records to standard MARC records for inclusion in the University of Pennsylvania Library’s catalog. Bunker also wrote a sophisticated search interface that provides access to manuscripts by way of a number of parameters including author, title, scribe, place, and text contained in passages transcribed.
The lists of authors, titles, scribes, places, and other significant terms enrich the information provided by major Sanskrit lexical sources currently available, and constitute an additional lexical source that will be added to the Sanskrit Library’s integrated dictionary interface. Scharf described the project and its results in several presentations (§8.1 #32, #39, #49–#50, #57) and publications (§7.2 #18, #20–#23).

5.3. Sanskrit lexical sources: digital synthesis and revision

A project jointly funded by the NEH and the Deutsche Forschungsgemeinschaft extended the Sanskrit Library’s integrated dictionary interface by integrating supplements to the major bilingual dictionaries already included, and by adding specialized dictionaries, indigenous Indian monolingual dictionaries, traditional thesauri, and traditional linguistic analyses. The integrated dictionary includes major bilingual dictionaries such as Monier-Williams’ A Sanskrit-English dictionary, major monolingual Sanskrit dictionaries such as the Vācaspatya and Śabdakalpadruma, and obscure specialized lexical resources such as a dictionary of Sanskrit words used for numbers (bhūtasaṁkhyā). The valuable information provided by the latter, which is based upon an article published in a Japanese journal, would otherwise remain beyond the reach of most Sanskrit scholars. By providing access to such sources, the integrated dictionary interface vastly eases access to specialized linguistic resources now consulted only by experts. The project links these lexical resources with digitized Sanskrit texts as described above.

5.4. Developing automated text-image alignment

The project researched methods to enhance access to primary cultural heritage materials of India by developing human-validated automated text-image alignment techniques in order to provide access to digital images via related machine-readable texts, lexical resources, linguistic software, and a sophisticated search interface. The integration of digital images of manuscripts into the Sanskrit Library allows generalized information extraction and search techniques to reach enormous reservoirs of Sanskrit manuscripts. Integrating primary cultural materials with the Sanskrit Library thus enables broad use of Indic collections for research and education where Indic materials are grossly underrepresented.

5.5. Cataloging the Houghton Library’s Indic Manuscript collection

This project completed the cataloguing of the entire collection of 1,700 Sanskrit manuscripts in the Houghton Library at Harvard University. Experts in Indic manuscriptology arranged and fully catalogued each manuscript using the Sanskrit Library’s template developed in the project described in section 5.2. Scharf and associates are still adding references to and editing entries, but drafts are available in the Sanskrit Library’s digital manuscript catalogue. The Sanskrit Library’s catalogue is accessible at https://sanskritlibrary.org/catindex.html.

5.6. Digitizing and cataloging manuscript collections in India

Scharf is currently a consultant to a private foundation launching a major long-term project to digitize and catalogue millions of manuscripts at major research libraries in India and to integrate them with linguistic software.

6. Prior research

While fascinated with the whole range of Sanskrit language, Indian philosophy, and Indian literature, Scharf has been particularly interested in the intellectual history of Indian linguistics and philosophy of language, in the development of conceptions of the self, and in the creativity expressed through the adaptation of ancient narratives in new contexts.

6.1. Indian philosophy of language

One of the central questions of philosophy concerns the basis of general conceptions. The problem extends from that of identifying an object or one’s self to be the same thing as encountered previously to that of classifying objects as of the same type. Both problems are intimately bound to linguistic usage. Both problems pervade the history of Indian philosophy and particularly the Indian philosophy of language.

Scharf’s first book (§7.1 #1) considered various points of view regarding whether general properties exist independently of conception and serve as the grounds for the use of common nouns. In it he examined the arguments of Buddhists and the counterarguments in Mīmāṁsā and Nyāya regarding the existence of independently existing grounds for the use of words and, in particular, the existence of general properties. He then compared Patañjali’s views and arguments in the Mahābhāṣya with those of Vātsyāyana in the Nyāyasūtrabhāṣya and Śabara in the Mīmāṁsābhāṣya regarding what conditions the use of common nouns. Patañjali and Vātsyāyana conclude that common nouns cause awareness of both the class property and the individual object in which the class property inheres because considerations of gender, number, and actions to be performed necessitate comprehension of individuals of the class. Śabara in contrast concludes that common nouns cause cognition only of class properties and that it is the context, rather than the speech form, that brings individuals of the class to awareness. Śabara’s argument that the use of common nouns for likenesses proves his view because likenesses are not individuals of the class, however, involves the fallacious identification of a class property with shape.

Despite confusion concerning class properties in early Mīmāṁsā, Scharf’s papers on class properties (ākr̥ti) (§7.3 #1–#2) and on intentionality (vivakṣā) (§7.3 #4) demonstrated that the concepts of a class property and a speaker’s intention could not have undergone the process of historical maturation between the second century bce and the fifth century ce assumed by certain scholars because both concepts are clearly articulated in Patañjali’s Mahābhāṣya (150 bce). “Pāṇini, vivakṣā, and kāraka-rule-ordering” (§7.2 #1) examined evidence that Pāṇini himself assumed the principle of a speaker’s intention. The article demonstrated that at least one sūtra in the Aṣṭādhyāyī articulates the principle, and that Kātyāyana as well as Patañjali clearly understood that linguistic convention limits a speaker’s intention in the effective use of language.

The presentations about recognizing speech (§8.1 #4; §8.2 #6) uncovered another case where proper elucidation of Patañjali’s Mahābhāṣya showed that the concept of grasping the speech form prior to its bringing its denoted object to awareness was clearly understood at a period earlier than certain scholars had assumed.

The paper on Kauṇḍabhaṭṭa’s views concerning the semantic conditions for kārakas (§8.2 #4) and the articles dealing with levels (§7.2 #5–#6) demonstrate that Pāṇini does not conceive of distinguishable psychological or procedural levels besides two related domains: meaning and speech. In recent discussions, one proponent of four levels, Kiparsky, agreed that by ‘levels’ he meant only distinct modes of reference, not psychological domains nor separate modules.

“Pāṇini’s use of prohibitive compounds” (§7.3 #3) examined the commentaries on sūtras that derive nouns from verbal roots in the Aṣṭādhyāyī in which the term anupasarge occurs, and demonstrated that the term must be understood as a compound in which the elements are not syntactically related (asamarthasamāsa) in order that it have the sense required by the rules.

“The natural-language foundation of metalinguistic case-use in the Aṣṭādhyāyī and Nirukta” (§7.2 #3) concerns the syntax of metalanguage and its conceptual presuppositions in the works of Pāṇini and Yāska. The article demonstrated that the metalanguage in both texts similarly utilizes both genitives and ablatives to signify the speech form that is the derivational source of a derivate. The usage originates in the overlapping domain of usage of the two cases in syntactic connection with direction words and in lineages in ordinary usage.

“The relation between etymology and grammar in the linguistic traditions of early India” (§7.3 #10) addresses the persistent problem of the historical priority of the Aṣṭādhyāyī and Nirukta. The article argues that the Nirukta is a multi-layered text the oldest layer of which has a lexical focus and antedates the Aṣṭādhyāyī but the later layers of which display cognizance of sophisticated systematic procedures of analysis used in Pāṇini’s work and post-date his work.

There was a recent debate concerning whether Pāṇini’s procedure of deriving correct speech forms begins with semantics, or first posits an approximate speech form as a target that it validates or corrects. “On the semantic foundation of Pāṇinian derivational procedure” (§7.3 #8) demonstrated that the former is the case through a careful examination of the derivation of the compound kumbhakāra and the commentaries on the rules involved. The grammar is a systematic projection of semantics onto speech that begins with objects and relations solely within the semantic domain prior to the introduction of any speech form.

“On the source of the cognition of time in verbs” (§7.2 #35) examines whether Indian linguists consider the verbal root or the verbal inflectional suffix to generate the cognition of time in verbal cognition.

6.2. Linguistics

The request to contribute the chapter on Linguistics in India to the Oxford handbook of the history of linguistics (§7.2 #14) recognized Scharf as an authority on Indian linguistic systems.

Pāṇinian linguistic description

Because linguistic terminology and analysis pervade commentary in every field, and linguistic treatises describe the language in which treatises in every field are written, progress in the intellectual history of Indian linguistics promises to have far-reaching influence on Indian intellectual history generally. Scharf has undertaken to evaluate the descriptions of language offered by various ancient Indian linguists and to compare these descriptions with extant Sanskrit texts. “Pāṇinian accounts of the Vedic subjunctive” (§7.3 #7) evaluates how adequately competing accounts of the subjunctive describe Vedic usage and concludes by preferring the account that depends less on escape rules. The paper likewise argues that comprehensive evaluation of the linguistic system and the text is required to evaluate the degree of correlation between the linguistic description and the text, and that the procedure of examining selected individual forms and rules used by scholars previously is inadequate because different cases lead to contradictory results.

While the previous paper considered the relation of grammatical rules to Vedic forms, “Pāṇinian accounts of the class eight presents” (§7.3 #6) considered how variation in lists that supplement the set of rules, particularly the dhātupāṭha, alters the linguistic description of the linguistic system that comprises those lists. The paper evaluates the adequacy of competing linguistic accounts of verbal forms proffered by commentators on root-lists (dhātupāṭha) that represent and categorize roots differently. In particular the etymologically infelicitous inclusion of the root r̥ṇ in class eight instead of r̥ in class five allows the linguistic system to account for the appearance of the form arṇavat in the Atharvaveda without modifying any rules.

The recognition that the dhātupāṭha plays an essential role in the Pāṇinian linguistic system led Scharf to analyze the dhātupāṭha given in the Mādhavīya Dhātuvr̥tti in comparison with the rules of the Aṣṭādhyāyī, to restore roots to the canonical form expected by Pāṇinian rules and to prepare a digital edition and index (§7.1 #4–#5; §8.1 #18) (https://sanskritlibrary.org/Sanskrit/Vyakarana/Dhatupatha/index2.html).

In other work on the dhātupāṭha (§8.2 #23), Scharf evaluated arguments for and against the view that the list originally included meanings. Although he supported the view that the list generally did not include meanings, he demonstrated that at least some semantic conditions had to be included to prevent contradictions.

Pāṇinian procedure

In a number of articles, Scharf considered problems exposed by his long-term project of formalizing Pāṇinian procedure. Computational formalization forces one to deal systematically with issues and brings together data in new ways that permit new insights.

Pāṇinian rules occasionally include conditions that refer to a subsequent state of derivation. “Rule-blocking and forward-looking conditions in the computational modeling of Pāṇinian derivation” (§7.2 #12) analyzes one such situation in the derivation of perfect active participles and describes a computational solution that delays the decision to accept the result of applying the rule until the subsequent referenced state has been reached.
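
The decision-delay idea described above can be illustrated with a minimal sketch. It is not Scharf's implementation: the names PendingRule and derive, and the element labels ROOT, SUFFIX, AUG, ENDING-A, and ENDING-B, are hypothetical placeholders that stand for no actual Pāṇinian rule or form. The sketch carries forward both the state in which the rule was tentatively applied and the state in which it was not, and chooses between them only once the subsequent state the condition refers to exists.

```python
from dataclasses import dataclass
from typing import Callable, List

State = List[str]  # toy derivation state: a sequence of hypothetical element labels


@dataclass
class PendingRule:
    """A rule whose acceptance depends on a later state of the derivation."""
    name: str
    apply: Callable[[State], State]    # tentative application of the rule
    confirm: Callable[[State], bool]   # forward-looking condition, checked later


def derive(state: State, pending: PendingRule,
           later_steps: List[Callable[[State], State]]) -> State:
    """Run later steps on both branches, then decide whether to keep the rule."""
    tentative = pending.apply(list(state))  # branch with the rule applied
    fallback = list(state)                  # branch without it
    for step in later_steps:
        tentative = step(tentative)
        fallback = step(fallback)
    # The decision is taken only now that the referenced later state is available.
    return tentative if pending.confirm(tentative) else fallback


# Hypothetical rule: insert AUG after the first element, valid only if the
# derivation eventually ends in ENDING-A.
insert_augment = PendingRule(
    name="insert-augment",
    apply=lambda s: s[:1] + ["AUG"] + s[1:],
    confirm=lambda s: s[-1] == "ENDING-A",
)


def add_ending(label: str) -> Callable[[State], State]:
    """Return a later derivation step that appends the given ending label."""
    return lambda s: s + [label]


accepted = derive(["ROOT", "SUFFIX"], insert_augment, [add_ending("ENDING-A")])
rejected = derive(["ROOT", "SUFFIX"], insert_augment, [add_ending("ENDING-B")])
```

In this toy setting the tentative application survives in the first derivation and is discarded in the second, so the rule itself never needs to know the outcome in advance.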

More difficult is how to determine which of conflicting rules takes precedence. When the domain of a rule is wholly contained within the domain of another rule, the rule with the narrower domain must take precedence just by virtue of the fact that it was stated; otherwise it would have no scope. However, when each of two rules with overlapping domains has scope in its own proper domain, some criterion is required to determine which takes precedence in the overlapping domain. “Rule selection in the Aṣṭādhyāyī or Is Pāṇini’s grammar mechanistic?” (§7.2 #13) carefully examines criteria proposed by ancient Indian grammarians and modern scholars but finds no single consistent universal solution. The examination leaves in place the solution proposed by Patañjali, that the desired rule applies, despite the fact that it is deemed unsatisfactory because it renders the operation of the grammar subject to knowledge of its outcomes. The paper concludes that a complete examination of the complex problem, which has been and could only be examined partially otherwise, will require the assistance of computational modeling. “On the resolution of conflict between accentual rules and other rules of derivation in Pāṇinian grammar” examines Rāmacandra’s solutions to the conflict between the accentual rule A. 6.1.186 and the single replacement rule A. 6.1.97 in the derivation of the third person plural verbal form pacanti, and accepts his conclusion that the accent rule has priority, but on the grounds that the rule includes the term upadeśa, which refers to the state of original instruction, rather than for the reasons he gives.
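
The containment criterion stated at the start of the preceding paragraph can be sketched in a few lines. The sketch is an illustration under simplifying assumptions, not a model of the Aṣṭādhyāyī: rule domains are represented as finite sets of hypothetical context labels (c1 through c4), and the names Rule, select, and UnresolvedConflict are invented for the example. When one domain is properly contained in the other, the narrower rule wins; when the domains merely overlap, the sketch raises an error to mark that containment alone gives no answer and a further criterion is needed.

```python
from typing import FrozenSet, NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    domain: FrozenSet[str]  # hypothetical contexts in which the rule applies


class UnresolvedConflict(Exception):
    """Both rules apply and neither domain is contained in the other."""


def select(a: Rule, b: Rule, context: str) -> Optional[Rule]:
    """Choose the rule to apply in a context, using domain containment only."""
    in_a, in_b = context in a.domain, context in b.domain
    if not in_a and not in_b:
        return None            # neither rule applies here
    if in_a and not in_b:
        return a               # no conflict: only a applies
    if in_b and not in_a:
        return b               # no conflict: only b applies
    # Both apply: the rule with the properly narrower domain takes precedence,
    # since it would otherwise have no scope at all.
    if a.domain < b.domain:
        return a
    if b.domain < a.domain:
        return b
    raise UnresolvedConflict(f"{a.name} vs {b.name} in context {context}")


general = Rule("general", frozenset({"c1", "c2", "c3"}))
special = Rule("special", frozenset({"c2"}))        # wholly inside general
partial = Rule("partial", frozenset({"c3", "c4"}))  # overlaps general in c3 only

winner = select(general, special, "c2")
```

Calling select(general, partial, "c3") raises UnresolvedConflict, which corresponds to the overlapping case for which the paper finds no single consistent criterion.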

The section of rules in the Aṣṭādhyāyī that introduces stem-forming affixes (vikaraṇa) is interpreted by commentators as containing locatives that refer to the right context of the affix to be introduced. This interpretation is accepted by modern scholars too. Yet such an interpretation necessitates complicating the derivation of accents which in Pāṇinian grammar are adjusted at each step of derivation. “Teleology and the simplification of accentuation in Pāṇinian derivation” (§7.2 #16) argues that the complication is removed by interpreting the locatives as locatives of domain rather than right context locatives. With a locative of domain, the rules introduce the stem-forming affixes on condition that certain referenced forms will be introduced later. A decision-delay procedure such as was described in the article on forward-looking conditions above (§7.2 #12) avoids indeterminism in these rules. The result is a more elegant systematic description of accentuation. “On the status of nominal terminations in upapada compounds” (§7.2 #31) reinforces this conclusion by examining rules that form nominal derivates that occur only as the final constituents of compounds. The paper demonstrates that even when the subordinate constituents in such compounds are referred to explicitly with terms denoting nominal terminations, the nominal terminations cannot be present at that stage of the derivation. Conversely, “Are taddhita affixes provided after prātipadikas or after padas?” (§7.2 #32) shows that because Pāṇinian procedure does indeed require nominal terminations on speech forms after which taddhita affixes are introduced, it is necessary to retain application of the term aṅga ‘stem’ conjointly with the term pada ‘word’ in order for required operations to take place, despite the fact that headings under which these terms are introduced indicate that the latter term alone should remain applicable.
“A computational implementation of Pāṇini’s derivational morphology of Sanskrit” (§7.2 #36) considers the interplay of formal affixation headings and subordinate headings that state semantic conditions in the taddhita section concerned with the derivation of secondary nominal derivates. The paper argues that Pāṇini operated with a constrained multiple inheritance structure that efficiently maps affixes to meanings to account for the complex homonymy and synonymy of the derivates.

“Modeling Pāṇinian grammar” (§7.2 #5) compares obvious methods to implement a few aspects of Sanskrit grammar computationally, comments upon the degree to which they approach or depart from Pāṇinian methodology and exemplifies methods that would achieve a closer model. The question of levels and the role of semantics are dealt with at some length. The article demonstrates the extent to which the grammar is founded upon semantics and concludes that Pāṇini conceived of just two levels, meaning and sound, generating the latter from the former. “An XML formalization of the Aṣṭādhyāyī” (§7.2 #28) describes the structure of a formalization of the rules of the Aṣṭādhyāyī amenable to computational implementation of those rules while “Some issues in formalizing the Aṣṭādhyāyī” (§7.2 #29) describes various problems in the Pāṇinian description brought to light by the attempt to formalize these rules and the solutions proposed to solve them.

Pāṇinian commentary

Commentaries on Pāṇini’s Aṣṭādhyāyī, besides helping to elucidate the grammar, its procedures, and its description of linguistic phenomena, present interesting exegetical, critical, and historical problems. “Counterexamples (pratyudāharaṇa) in Pāṇinian Grammar” (§7.2 #33) examines the syntax and function of counterexamples in the Kāśikā commentary, isolates the defining feature of the counterexample and reveals textual problems where counterexamples do not conform to this feature. The systematic examination of variants reported in Jinendrabuddhi’s Nyāsa and Haradatta’s Padamañjarī currently underway and described in a co-authored paper (§7.2 #38) should lead to an improved edition of the Kāśikā.

Computational linguistics

“An analytic database of the Aṣṭādhyāyī” (§7.2 #17) describes the database of Pāṇini’s Aṣṭādhyāyī Scharf created in 1991. The database analyzes sandhi in sūtras, provides morphological identification of each word, analyzes compounds, indicates recurrence of terms, indicates the type of each sūtra and the significance of the case used in each word. It likewise provides extensive classification of grammatical elements including roots, gaṇa-elements, affixes, augments, markers, and pratyāhāras and other technical terms.

Sanskrit presents a number of challenges for computational linguistic processing including rich morphology, an enormous corpus and the fact that sandhi and script conventions obscure word boundaries. The coauthored article, “A distributed platform for Sanskrit processing,” (§7.2 #19) describes innovative solutions to these problems developed through international collaboration. Solutions include efficient segmenting and tagging algorithms, dependency parsers based on constraint programming, and the integration of lexical resources, text archives and linguistic software through distributed interoperable Web services.

Accentuation

Scharf’s article entitled “Vedic accent: underlying versus surface” (§7.2 #15) and related papers (§8.1 #12, #47) evaluated the descriptions of accentuation in various Vedic phonetic treatises (prātiśākhya) and compared them with the practices of marking accent in editions and manuscripts of the corresponding Vedic texts. The article distinguished two traditions of recitation that are commonly confounded: an earlier one, exemplified in the Aṣṭādhyāyī and Vājasaneyiprātiśākhya, in which the high pitch (udātta) is the highest tone, and a later one, exemplified in the R̥kprātiśākhya, in which the circumflex (svarita) is recited higher than the high pitch. The traditions agree in marking what is recited highest with a vertical line above but differ in which underlying accent is so marked. Concurrently, under the purview of the digital Sanskrit library project described above, Scharf compiled a detailed account of accent marks and other Vedic characters in order to prepare a proposal to extend Indian script code blocks in the Unicode standard to allow adequate representation of Vedic (§7.2 #8–#11; §7.4 #1–#14; §7.6 #17). A recent paper (§8.2 #50) demonstrated that the R̥kprātiśākhya system influenced the descriptions and understanding of accent in other Vedic traditions. Several other papers and presentations consider issues in the derivation of accent in Pāṇinian grammar (§7.2 #15–#16, #34; §8.1 #51; §8.2 #30).

Syntax

“Clause-initial dvayám” (§7.3 #5) corrected the syntactic analysis of a passage containing the term dvayám in the Śatapathabrāhmaṇa by considering relevant Pāṇinian rules, sections in modern grammars and the related ritual texts.

“Interrogatives and word-order in Sanskrit” (§8.2 #7, #11; §7.2 #24) refuted a contemporary theoretical over-generalization that constrained the position of interrogatives to sentence-final position in head-final languages. The paper not only demonstrated that the particular serialization does not hold for Sanskrit but in addition demonstrated that the premise that roles are associated fundamentally with position is misguided, even if there is an established usual order.

Scharf edited a volume of papers presented at the seminar on Sanskrit syntax he organized in Paris 13-15 June 2013, of which the program is available at https://sanskritlibrary.org/syntaxParis/program.html (§7.1 #9). The volume includes two papers for which computational syntactic research on Sanskrit was carried out under his direction (§7.2 #25–#26). The first demonstrated a significant departure in the order of complements and heads in poetry from their order in prose. While agents, objects, adverbs and instruments preceded their head verbs, and qualifiers and genitives preceded what they qualified, ninety percent of the time in prose, most complements preceded their heads only sixty-six percent of the time in poetry; qualifiers and instruments preceded their heads only about forty percent of the time. The second paper analyzed the explicit and implicit provisions Pāṇini makes concerning the cooccurrence conditions of preverbs and verbal roots and utilized the analysis in a computational analysis of the Pāṇinian account of middle voice (through the provision of ātmanepada terminations).

Metrical analysis

A third co-authored paper in the Sanskrit syntax volume presented Sanskrit metrical-analysis software based upon Kedārabhaṭṭa’s Vr̥ttaratnākara (§7.2 #27). A more comprehensive revised version of the software was presented in a co-authored paper at the World Sanskrit Conference in Bangkok in 2015 (§8.2 #44), and Scharf presented an analysis of metrics in the Mahābhārata that utilized the software in a comprehensive semi-automated system to prepare digital editions of texts in accordance with the Text-Encoding Initiative Guidelines (§7.2 #37).

Phonetics and encoding

Linguistic Issues in Encoding Sanskrit (§7.1 #6) surveyed technologies for representing the Sanskrit language in writing (Indic scripts, particularly Devanāgarī script and its standard Romanization) and in digital encoding systems (from legacy fonts and meta-encodings to Unicode), analyzed the history of encoding in terms of transitions in the medium of knowledge transmission, critiqued encoding systems and scripts on the ground of fundamental principles of precision in information transmission, analyzed Sanskrit phonology, and designed accurate and consistent segmental, featural, and ASCII-based encoding schemes for Sanskrit. “Linguistic issues and intelligent technological solutions in encoding Sanskrit” (§7.3 #9) makes a concise presentation of the essential issues. Current encoding systems reproduce deficiencies inherent in traditional writing systems. The contemporary use of computers for the manipulation of linguistic and textual data demands more relevant information-processing principles. Encoding is relative to the information to be conveyed in the structure of the language represented. Distinctive elements should be encoded consistently and unambiguously. Doing so requires selecting phonic or graphic units, segments or features, and determining precise criteria for contrasting elements.
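
The principle that each distinctive element be encoded consistently and unambiguously can be illustrated with a minimal segmental sketch. The table below is a small illustrative subset with code letters chosen for this example; it is an assumption and is not presented as the scheme designed in the book. Romanization writes some single sounds as two letters (for instance the aspirate kh), so the sketch maps each sound to exactly one ASCII character, which makes decoding a simple character-by-character lookup.

```python
# Hypothetical, partial table: one Roman spelling per sound, one code per sound.
SEGMENTS = {
    "kh": "K", "gh": "G", "ai": "E", "au": "O",
    "k": "k", "g": "g", "h": "h", "s": "s", "m": "m",
    "a": "a", "i": "i", "u": "u",
}
CODES = {code: roman for roman, code in SEGMENTS.items()}


def encode(roman: str) -> str:
    """Encode Romanized text one sound at a time, preferring two-letter spellings."""
    out = []
    i = 0
    while i < len(roman):
        for width in (2, 1):
            chunk = roman[i:i + width]
            if len(chunk) == width and chunk in SEGMENTS:
                out.append(SEGMENTS[chunk])
                i += width
                break
        else:
            raise ValueError(f"no segment for {roman[i]!r} in this partial table")
    return "".join(out)


def decode(encoded: str) -> str:
    """Decode by direct lookup; every character stands for exactly one sound."""
    return "".join(CODES[ch] for ch in encoded)


encoded = encode("sukham")   # five sounds: s, u, kh, a, m
restored = decode(encoded)
```

Because every sound occupies one character, the length of the encoded string equals the number of sounds, and operations such as sandhi or pattern matching can address sounds directly rather than letter sequences.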

6.3. Indian Philosophy

The request for his contributions to the Routledge Encyclopedia of Hinduism recognized Scharf’s expertise in soteriological branches of Indian philosophy (Sāṅkhya, Yoga, and Vedānta). He wrote some twenty-seven articles dealing with various systems of Indian philosophy, including the variant lists of which systems there are, and modes of evidence used in them. Prominent are articles on Sāṅkhya, Yoga, karma, ātman, and brahman (§7.2 #4).

6.4. The Self

Scharf has been interested in concepts of the self in both European and Indian philosophy since he was an undergraduate. He developed these ideas during a seminar he taught on Yoga philosophy at the University of Virginia (§9 #9), presented them at lectures at Pennsylvania State University (§8.1 #1), and explored them at Brown in a course on concepts of the self (§9 #4), and in Sanskrit reading courses in the Upaniṣads, Śaṅkara’s commentary on the Bhagavadgītā, Patañjali’s Yogasūtra, and Buddhist philosophy (§9 #12b). This led him to elucidate the nature of consciousness in the Br̥hadāraṇyaka Upaniṣad in his paper on sañjñā (§8.2 #12). In his piece “Creation Mythology and Enlightenment” (§8.1 #5), he explored how themes in Indian mythology concerning the origin of the universe complement philosophical themes concerning the full discovery of the self. He has distant plans to retranslate Patañjali’s Yogasūtra and Vyāsa’s Yogabhāṣya on the basis of a study of Nāgeśa’s untranslated commentary. Scharf was invited to present a paper on Yoga at the First International Yoga Conference, organized by the Indian Council for Cultural Relations and the Consulate General of India, New York, in June 2018 (§8.1 #71).

6.5. Narrative Adaptation

One of the most enjoyable pursuits in the study of Sanskrit literature is to explore the motivations for the adaptation of narratives in subsequent versions. In advanced Sanskrit reading courses he often traced the interpretation of a Vedic myth in various genres through the history of Indian literature (§9 #12d). One such myth is that of Purūravas and Urvaśī. In a paper entitled “The Compassionate Urvaśī” (§8.1 #7; §8.2 #5), he reexamined the interpretation of R̥gveda 10.95 in subsequent Indian literature, including in Ṣaḍguruśiṣya’s Vedārthadīpikā. Because Ṣaḍguruśiṣya’s work is full of narratives related to various R̥gvedic hymns, he was led to collect manuscripts of the work for the purpose of making a critical edition. With the assistance of a grant from the American Philosophical Society (§5.2 #2) he collected some sixty manuscripts of the text. Similarly, tracing the interpretation of the story of Rāma in various genres in advanced Sanskrit classes led to his book on the Rāmopākhyāna (§7.1 #2), and to papers on the ethics of the final episodes and Sītā’s divinity (§8.1 #22). Editions and even more so translations flatten the texture of a work, thereby presenting it as a more integrated whole. Scharf’s work digitally cataloguing manuscripts of the Mahābhārata in the project described in section 5.2 revealed interesting facts about the transmission and constitution of that text. His paper, “Five jewels in the University of Pennsylvania’s Rare Book and Manuscript Library,” examines the relationship of five subsections of the Mahābhārata that were often transmitted as a unit and corroborates the view that these works were originally independent pieces. Exploration of the methods of, and motivations for, the adaptation of ritual practice is equally intriguing. His student Kartik Venkatesh and he analyzed the structure and adaptation of pūjā in five major Indian festivals and have nearly completed a book on the topic.

In a paper presented at the Montreal Mahābhārata Conference and published as a chapter in Rukmani’s book on the Mahābhārata, he argued for publication of multi-level text (§7.2 #2). In contrast to a translation devoid of notes that presents the reader with a single, static, flat version of a narrative, multilevel text that includes comments, notes, alternatives, multimedia accompaniments, etc. can communicate the source work more fully and meaningfully. His book on the Rāmopākhyāna (§7.1 #2) is just such a work.

In numerous presentations and writings he has argued that we are in the midst of a major media-transition (§8.1 #9–#11; §7.1 #6; §7.3 #9). The transition from printed to digital media is comparable to the transitions from oral to written media, and from written to printed media. Productions of human knowledge that don’t convert to the new media recede from public awareness and perish in oblivion. In order to ensure that the vast body of knowledge contained in Sanskrit texts survives the transition to digital media, and to take advantage of the greater range and flexibility of presentation it offers, he founded a digital Sanskrit library (https://sanskritlibrary.org).

6.6. Digital Sanskrit library

In order to facilitate comprehensive comparisons between various ancient Indian linguistic descriptions and Sanskrit texts, and to facilitate general access to Sanskrit literature, lexica, grammars, and manuscripts, he is engaged in a project to develop an integrated international digital Sanskrit library. After receiving several minor grants, he obtained a major grant from the National Science Foundation (NSF), 2006-2009 (§5.2 #12-#14). The project, funded by the NSF’s Division of Intelligent Systems, integrated the linguistic software modeling Pāṇinian inflection and sandhi rules, developed by Scharf and his late colleague Malcolm Hyman at Brown, with bilingual lexical resources digitized in the Cologne digital Sanskrit lexicon project, and machine-readable Sanskrit texts in the TITUS archive at Frankfurt. In subsequent years he obtained four additional major grants from the National Endowment for the Humanities to catalogue and digitize Sanskrit manuscripts, develop image-text alignment software, and digitize and integrate Sanskrit lexical resources (§5.2 #15–#18). The Sanskrit Library is building an integrated digital library that allows seamless access to grammatical information and lexical sources by clicking words in texts, access to citations in context by clicking citations in lexical sources, and focused access to sought passages in manuscript images. The system will facilitate linguistic, philological, and topical research in Sanskrit generally much as the Perseus project has in Classical philology.

One task of the Sanskrit library project was to develop encoding standards. Already mentioned is the project to extend Indian script code blocks in the Unicode standard to accommodate special characters in Vedic. The Unicode standard is a script-based encoding. While using Unicode for display, he devised an independent phonology-based encoding scheme to facilitate internal linguistic processing and allow users to choose modes of display including both Roman and Devanāgarī. Hyman and he considered the linguistic issues in coding Sanskrit, or any language, in their book Linguistic Issues in Encoding Sanskrit (§7.1 #6). At https://sanskritlibrary.org, morphological software is located under ‘Tools’, an integrated dictionary interface is available under ‘Reference’, and digital texts linked to analytic tools are available under ‘Texts’.

7. Teaching

Great ideas, profound insights, and penetrating discoveries are best appreciated and most inspiring when approached through the original expressions of the sages, scientists, and leaders who first articulated them. This is why foundational texts and great classics continue to attract seekers of knowledge far beyond their time of composition. Learning to access such works, preferably in their original language, shaped Scharf’s own education in the history of philosophy in college and in Sanskrit studies in graduate school. Helping students to appreciate such works for themselves is the guiding principle of Scharf’s teaching.

To gain true appreciation of ideas requires internalizing them so that they become a part of one’s intimate experience. The process of internalization requires active participation in the learning process and utilization of diverse avenues of learning: aural, visual, and hands-on. Thus principles of universal access govern Scharf’s preparation of teaching materials and class structure, which include traditional Indian teaching methods. Language materials, for instance, include carefully designed tables, charts, web-materials, and audio files, and class includes interactive oral work and visuals, as well as written exercises. Classes include lectures with prepared slide presentations as well as guided discussion using the Socratic method.

Scharf taught an Introduction to Hinduism (§9 #3), a course on the Yogasūtra and commentaries (§9 #9), and a course on the Aitareya Brāhmaṇa (§9 #12diii) in the Department of Religious Studies at the University of Virginia in the spring of 1992. He taught a general course on South Asian Civilization (§9 #1), a course on Concepts of Self in Classical Indian Literature (§9 #4), and all levels and genres of Sanskrit literature (§9 #11–#12) for nineteen years in the Department of Classics at Brown University, where he was promoted to senior lecturer and served as concentration advisor and chair of the South Asian Studies Committee. There he wrote his own introduction to Sanskrit with audio materials, called Śabdabrahman, for use in his first-year classes. His independent-study reader Rāmopākhyāna: the Story of Rāma in the Mahābhārata (§7.1 #2) has garnered enthusiastic reviews and is widely used in Sanskrit classes. The digital version with its index, found under Pedagogy at the Sanskrit Library website, served as the model to develop two other Sanskrit texts for display in the Kramapāṭha reader: Pūrṇabhadra’s Pañcākhyānaka and Pāṇini’s Aṣṭādhyāyī. A fourth, Viṣṇu Purāṇa, Book 4, awaits final editing. In addition, he developed video clips to demonstrate Devanāgarī character formation. Other Sanskrit exercises that use intelligent feedback systems drawing on transliteration, sandhi, and inflection software are under development. Scharf taught courses in Indian cultural tradition, historical and comparative linguistics, Indian linguistic theory and Pāṇinian grammar at the Indian Institute of Technology Bombay and the International Institute of Information Technology in Hyderabad. At the latter he currently teaches a course on Indian semantics and ontology and has more than thirty students in his Introductory Sanskrit class.