Who's Using What: Rashmi Singhal, Harvard University

Who's Using What is a blog series by Gregory Markus from The Netherlands Institute for Sound and Vision as part of EuropeanaTech. The idea behind the series is to raise awareness of the brilliant open source software options now available to institutions, and to encourage collaboration in the digital heritage community. What better way to show off these tools than to talk to the developers who use and build them? You can also find lots of open source options in the new EuropeanaTech FLOSS Inventory.

For this edition of Who's Using What we reached out to Rashmi Singhal of Harvard University. Along with being a developer at Harvard, Rashmi also serves as technical lead for the IIIF platform, Mirador.

1. What open source tools are you currently working with?

I work on various digital humanities projects at Harvard University. Often, these projects already have legacy databases and data entry workflows set up, so I try to find ways to expose the data that make it easier to use without changing the workflow for the people who work with the databases. I usually use a stack that includes Django and Elasticsearch to create an API into those legacy databases. I like to use Varnish to improve performance where I can.
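Editor's note: the interview does not describe the actual schemas or code, so the following is only a minimal sketch of the general idea of reading a legacy database without touching its data entry workflow and reshaping the rows into JSON-ready documents. It uses Python's built-in sqlite3 module as a stand-in for a legacy database; the table name, column names, sample rows, and function names are all hypothetical. In a stack like the one described, documents of this kind could then be indexed in Elasticsearch and served through a Django API, but neither of those steps is shown here.

```python
import json
import sqlite3


def make_legacy_db():
    """Build an in-memory stand-in for a legacy database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        "obj_id INTEGER PRIMARY KEY, obj_title TEXT, site_cd TEXT)"
    )
    # Placeholder rows, made up purely for illustration.
    conn.executemany(
        "INSERT INTO objects (obj_id, obj_title, site_cd) VALUES (?, ?, ?)",
        [(1, "Example object A", "SITE-1"), (2, "Example object B", "SITE-2")],
    )
    conn.commit()
    return conn


def row_to_document(row):
    """Map terse legacy column names to friendlier API field names."""
    return {
        "id": row["obj_id"],
        "title": row["obj_title"],
        "site": row["site_cd"],
    }


def export_documents(conn):
    """Read rows (read-only SELECT) and return a list of JSON-ready dicts."""
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    cursor = conn.execute(
        "SELECT obj_id, obj_title, site_cd FROM objects ORDER BY obj_id"
    )
    return [row_to_document(row) for row in cursor]


if __name__ == "__main__":
    docs = export_documents(make_legacy_db())
    print(json.dumps(docs, indent=2))
```

Because the export only issues a SELECT, the people entering data keep working exactly as before, while the renamed fields give API consumers something easier to read than the original column names.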

2. What open source tools have you used in the past to develop larger applications?

I have used XML databases, such as eXistDB, when working with XML documents. I was also doing quite a bit of OCR work with open source OCR engines like Tesseract to aid in the creation of TEI documents of ancient Greek, Latin, and Arabic texts.

3. What are you currently developing?

I am working with an Egyptologist at Harvard University who has an extensive archive of data from various excavations of the Giza Necropolis. By building an API on top of that data, we will soon be able to do interesting analyses and visualizations that were not easy to do in the past.

4. What would you like to see developed?

I would love to build on the work with the Egyptologist to find ways to connect distinct archaeological datasets that are owned by different institutions and use diverse schemas. A big hurdle is that often these datasets are either not digitized or not publicly available, but for those that are, moving towards interoperability would really improve the experiences of students and scholars.