Introduction to Digital Philology

  • UT101
  • As the transmission of knowledge moves from the printed to the digital medium, it is essential that students of Sanskrit, like students of every discipline, learn to use the technologies of this new medium. While the handwritten manuscript and the printed book were the principal media of knowledge transmission, students had to learn to read and write, to use pen and paper, and to use the aids that help readers navigate books, such as tables of contents, alphabetical indices, and page numbers. The digital age offers new aids for navigating knowledge: web pages, links, and search interfaces. The digital medium also offers more complex methods of discovering knowledge relevant to one’s interests, such as automated information retrieval and information extraction. In turn, the facilities available for interlinking resources, and the expectations of users in the digital age, demand greater attention to how knowledge is prepared for distribution. This course prepares students to carry out such preparation for Sanskrit.
  • Instructors: Tanuja P. Ajotikar and Peter M. Scharf
  • Schedule: 9 September – 9 December 2022.
  • Course meeting times: Wednesday and Friday, 9:30 am U.S. Central Time (8:00 pm IST; 9:00 pm IST from 6 November 2022).
  • Prerequisite: Basic knowledge of Sanskrit.
  • Course fee: $1500.
  • Course fee for Indian residents: INR 15000.
  • Register (Non-Indian residents)
  • Register (Indian residents)
  • Course materials:
    1. Huet, Gérard and Idir Lankri. 2018. “Preliminary design of a Sanskrit corpus manager.” Computational Sanskrit and Digital Humanities: selected papers presented at the World Sanskrit Conference, University of British Columbia, Vancouver, 9–13 July 2018, ed. by Gérard Huet and Amba Kulkarni, pp. 259–76.
    2. Ajotikar, Tanuja P. and Peter M. Scharf. 2022. “Development of a TEI standard for digital Sanskrit texts containing commentaries.” 16th International conference on natural language processing. Hyderabad: International Institute of Information Technology.
    3. Ajotikar, Tanuja P., Anuja P. Ajotikar, and Peter M. Scharf. 2018. “Enriching the digital edition of the Kāśikāvṛtti by adding variants from the Nyāsa and Padamañjarī.” Computational Sanskrit and Digital Humanities: selected papers presented at the World Sanskrit Conference, University of British Columbia, Vancouver, 9–13 July 2018, ed. by Gérard Huet and Amba Kulkarni, pp. 207–18.
    4. Scharf, Peter M. 2018. “TEITagger: raising the standard for digital texts to facilitate interchange with linguistic software.” Computational Sanskrit and Digital Humanities: selected papers presented at the World Sanskrit Conference, University of British Columbia, Vancouver, 9–13 July 2018, ed. by Gérard Huet and Amba Kulkarni, pp. 169–91.
    5. Scharf, Peter M. and Malcolm D. Hyman. 2011. Linguistic issues in encoding Sanskrit. Delhi: Motilal Banarsidass.
Lecture Topic
1 (Week 1) Typing with ten fingers.
2 (Week 2) Computer desktop folders and files; file types (txt, doc, rtf, docx, pdf, jpg, png); searching within files and across files. Geany, Notepad++ (Windows), Gedit (Ubuntu), BBEdit (Mac), MS Word, OpenOffice, etc.
3 Media transition. Reading: Ch. 1 Scharf and Hyman 2011
4 Unsuitable character encoding: WX, Velthuis, KH, ITrans, IAST, ISO 15919, Unicode Devanagari. Reading: Ch. 2 Scharf and Hyman 2011
5 The basis for encoding (1 hr). Reading: Ch. 4 Scharf and Hyman 2011; Sanskrit phonology (2 hrs). Reading: Ch. 5 Scharf and Hyman 2011; Appendices A1–7
6 Principles of contrastive phonology, ideal character encoding: SLP. Reading: Ch. 6 Scharf and Hyman 2011; Appendices B–C
7 What is Philology?
8 Philology in the digital era
9 Principles of text-encoding, and introduction to TEI. Reading: Scharf 2016, 2018b
10 TEI text-encoding of prose and verse
11 TEI bibliography markup
12 TEI inflectional tagging. Reading: Huet and Lankri 2018
13 TEI lexical tagging
14 Introduction to regular expressions
15 Using regular expressions
16 Tagging the base text along with the commentary in TEI
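
To give a sense of how the later lectures fit together (regular expressions and TEI encoding of verse), here is a minimal sketch in Python. It is an illustration only, not course material: the input convention (a single "|" closing a line and a double "||" closing a verse), the function name verses_to_tei, and the sample text in SLP1-style transliteration are all assumptions made for this example. The TEI elements used are the standard <lg> (line group) and <l> (verse line).

```python
import re
from xml.sax.saxutils import escape

# Assumed input convention (hypothetical, chosen for this example):
# a single "|" closes a verse line and a double "||" closes a verse.
VERSE_END = re.compile(r"\s*\|\|\s*")
LINE_END = re.compile(r"\s*\|\s*")


def verses_to_tei(text: str) -> str:
    """Wrap each verse in <lg type="verse"> and each line in <l>."""
    groups = []
    for verse in VERSE_END.split(text):
        # Split the verse into lines and drop empty fragments.
        lines = [ln.strip() for ln in LINE_END.split(verse) if ln.strip()]
        if not lines:
            continue
        body = "\n".join(f"  <l>{escape(ln)}</l>" for ln in lines)
        groups.append(f'<lg type="verse">\n{body}\n</lg>')
    return "\n".join(groups)


# Sample verse in SLP1-style transliteration (illustrative input).
sample = (
    "Darmakzetre kurukzetre samavetA yuyutsavaH | "
    "mAmakAH pARqavAS cEva kim akurvata saMjaya ||"
)
print(verses_to_tei(sample))
```

Splitting on the double mark before the single one keeps the two patterns from interfering; a real edition would also need verse numbering and a TEI header, which this sketch omits.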