Cataloguing the Houghton Library’s Indic manuscript collection

Major activities

This project completes the cataloguing of the entire collection of 1,700 Sanskrit manuscripts in the Houghton Library at Harvard University as the first phase of a larger project to catalogue and digitize the manuscripts and to integrate them with corresponding digital texts in the Sanskrit Library. The project incorporates data previously collected in the Sanskrit Library’s template, which follows the American Committee for South Asian Manuscripts’ parameters and the Text Encoding Initiative’s manuscript guidelines. A subsequent project will produce digital images of the manuscripts and integrate them with the corresponding digital texts wherever the Sanskrit Library holds them.

Using protocols, formats, and software developed in the pilot project at Brown University and the University of Pennsylvania from 2009 to 2013, project personnel align digital images of manuscript pages with the corresponding digital text. Search and display software uses the alignment to provide dynamic, direct access to the individual manuscript pages that contain a sought passage and to display those pages with the context of the passage demarcated. The facilities for searching for morphological and prosodic variants of words, which the Sanskrit Library developed for digital text, will thus be extended to the digital manuscript images. This will allow generalized information extraction and search techniques to reach Sanskrit manuscripts. Conversely, a parser that leads from digital text to prosodic and morphological analysis, and to digital Sanskrit-English lexica, will be accessible from the interface that displays the manuscript image passages corresponding to the digital text.
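
The idea of using an alignment to go from a sought passage to the manuscript pages that contain it can be illustrated with a minimal sketch. It is not the project's software: the `PageAlignment` record, the `find_passage` function, the image file names, and the character-offset scheme are all assumptions made for illustration, and the sample text is written in plain ASCII without diacritics for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageAlignment:
    """Hypothetical record: one page image and the span of digital text it covers."""
    image: str  # file name of the page image (illustrative only)
    start: int  # offset of the first character of text on the page
    end: int    # offset one past the last character of text on the page


def find_passage(text, alignments, query, context=20):
    """Return (image, excerpt) for each page overlapping an occurrence of query.

    The excerpt shows the passage in square brackets with some surrounding text.
    """
    hits = []
    pos = text.find(query)
    while pos != -1:
        stop = pos + len(query)
        lo = max(0, pos - context)
        hi = min(len(text), stop + context)
        excerpt = text[lo:pos] + "[" + text[pos:stop] + "]" + text[stop:hi]
        for page in alignments:
            # A page matches when its text span overlaps the passage span.
            if page.start < stop and pos < page.end:
                hits.append((page.image, excerpt))
        pos = text.find(query, pos + 1)
    return hits


# Assumed sample data: one line of text split across two page images.
text = "dharmaksetre kuruksetre samaveta yuyutsavah"
alignments = [
    PageAlignment("folio_001r.jpg", 0, 24),
    PageAlignment("folio_001v.jpg", 24, len(text)),
]

for image, excerpt in find_passage(text, alignments, "samaveta"):
    print(image, excerpt)
```

A passage that runs across a page break is reported once for each page it touches, which is one simple way to let a display layer show every page needed to read the passage in full. Searching for morphological or prosodic variants would require expanding the query before this lookup, which the sketch does not attempt.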

The potential impact of giving a wide audience access to this large number of Sanskrit manuscripts is considerable. These texts constitute an enormous body of knowledge in diverse domains that is grossly underrepresented in the Western academic community. The proposed project dramatically increases the accessibility of digitized manuscripts that are currently accessible only to highly trained specialists. This access will be profoundly valuable to students, scholars, and the wider public concerned with such fields as historical and general linguistics, philosophy and religious studies, pharmacology and medicine, history of science and mathematics, and the general history and literature of South Asia.

Project personnel

  • Peter M. Scharf, Project Director
  • Ralph E. Bunker, Technical Director
  • Toke L. Knudsen, Cataloguer, Assistant Professor (SUNY Oneonta)

Grant details

  • period: 1 July 2013 -- 30 June 2016
  • U.S. funding: National Endowment for the Humanities, Division of Preservation and Access, grant number PW-51273-13
  • funding: $195,000
  • location: The Sanskrit Library