I am a Professor of Computer Science at RIT, where I direct the Document and Pattern Recognition Lab. I hold a PhD and Master's in Computer Science, a BA with a minor in Computer Science, and a Bachelor of Music degree from Queen's University, Canada.

My research interests include information retrieval, document recognition, pattern recognition, and machine learning. Recently I co-authored a book on mathematical information retrieval for Foundations and Trends in Information Retrieval (a PDF link can be found below) and gave a brief invited talk on the dprl lab's work at a theoretical physics and AI symposium held at the Perimeter Institute in April 2025.

I served as Program Co-Chair for ICDAR 2023, and chaired the ICFHR 2018, DRR 2012, and DRR 2013 conferences in the document recognition research community. I am also an Associate Editor for the IJDAR and Pattern Recognition journals, and serve on program committees for information retrieval conferences (e.g., SIGIR and the newer SIGIR-AP conference).

At RIT I have recently taught courses on Information Retrieval (grad/undergrad) and Machine Learning (undergrad). I was also the head of the AI Cluster within the Computer Science department from 2023 to 2025.

I am serving as Chair of the RIT Faculty Senate for 2025-2026. 

Please click on the links above for more information regarding my research, teaching, software and data from the dprl, and resources for students. Some recent news is included below.


News

  • (April 2025) I was honored to receive the Trustees Scholarship Award, the highest research award for RIT faculty, at a ceremony held April 9th at the RIT Field House (program is available here).
  • (April 2025) The dprl lab's systems for math and chemical search were featured in a short talk that I gave at the Theory + AI Symposium held at the Perimeter Institute in Waterloo, ON, Canada. I also participated in a panel discussion on processing the data of theoretical physics that included researchers from arXiv and Univ. Bonn. (videos)
  • (April 2025) SIGIR 2025 papers. The dprl lab has two demo papers on chemical information retrieval accepted for SIGIR 2025: "Multimodal Search in Chemical Documents and Reactions" - demo web page, and "Targeted Multi-Modal Passage Search for Molecules and their Synthesis Pathways" - demo web page. Congratulations to PhD students Ayush Kumar Shah, Abhisek Dey, Bryan Amador, and Patrick Philippy for their interesting graphics recognition and multi-modal retrieval work for these systems. Additional information, live demos, and introductory videos may be found here.
  • (Jan 2025) My PhD student Ayush Kumar Shah has accepted a Research Scientist position with Meta starting this coming summer.
  • (Jan 2025) My book on mathematical information retrieval with Behrooz Mansouri and Anurag Agarwal is finally complete. The actual, final version of the book can be obtained as a PDF here. My thanks to the FnT publisher for making it possible to share this freely.
  • (May 2024) I was happy to collaborate on a new survey with Masaki Nakagawa's group and Harold Mouchère: A Survey of Handwritten Mathematical Expression Recognition: The Rise of Encoder-Decoder and GNN Models. The paper is available for free from the Pattern Recognition journal here until June 21st. A preprint of the final paper is also available here.
  • (Jan 2024) Three students from my information retrieval class, Ben Giacalone, Greg Paiement, and Quinn Tucker, have published an interesting paper on the role of [MASK] tokens in the ColBERT retrieval model, which will be presented at the European Conference on Information Retrieval (ECIR) this March in Glasgow, Scotland.
  • (Nov 2023) I have posted a small Python debugging library on GitLab that was created for my classes and the dprl lab. The library is organized around pretty-printed debug checks/tests with descriptive messages. I've called it the Message-Oriented Debugging Library for Python (msg_debug). When program requirements keep changing and bugs abound, it avoids the need to repeatedly add and remove print, input, and assert statements to check values and types, and it provides functions to record and report execution times.

Richard Zanibbi's Home Page (RIT)