PhD (Comp. Sc.), MSc, BMusic, BA (Queen's University, Canada)
Director, Document and Pattern Recognition Lab (dprl)
Department of Computer Science
Rochester Institute of Technology (NY, USA)
I direct the Document and Pattern Recognition Laboratory (dprl) in the Department of Computer Science. I work on problems that involve pattern recognition, machine learning, information retrieval, and human-computer interaction. Before joining RIT, I worked in the Centre for Pattern Recognition and Machine Intelligence, the Diagram Recognition and Medical Computing labs at Queen's University (Canada), Legasys Corporation (a legacy software solution company), and at the Xerox Research Center Webster.
Over the years I have worked on a number of problems, often with the help of great students from the dprl. These include computer-assisted surgical planning, CAPTCHAs, parsing of two-dimensional notations, document analysis and recognition for typeset and handwritten documents, detecting text in natural scenes, evaluation metrics and tools for structural pattern recognition, and information retrieval in documents and videos.
Among the specific projects I have worked on, I contributed to the first publicly available pen-based equation editor (FFES, 1999-2007), the first multimodal math search interface (min, 2011-2014), and the Tangent formula search engine, which introduced a simple symbol pair-based retrieval model (online demo: Tangent v0.1). With Kurt Kluever I created the first Video CAPTCHA (2008). With Harold Mouchere and Christian Viard-Gaudin (Univ. Nantes, France), I produced a new graph model for representing recognition results, which can handle disagreeing segmentations (label graphs, 2011-2014), along with supporting evaluation tools (LgEval, 2013-2016).
Recently my PhD student Siyu Zhu produced a relatively simple algorithm, with some interesting innovations, for detecting text in images. This system obtained top results for the ICDAR 2015 Robust Reading Focused Scene Text database (CVPR 2016 paper). Another PhD student, Lei Hu, developed an innovative appearance-based method for parsing handwritten math and other notations (ICFHR 2016 paper). The Tangent formula search engine was recently improved in collaboration with my PhD student Kenny Davila, Andrew Kane (Univ. Waterloo), and Frank Tompa (Univ. Waterloo). Tangent is now fast, scalable, and produces state-of-the-art results (SIGIR 2016 paper, NTCIR-12 paper, source code).
Math Recognition and Retrieval
For my MSc I created DRACULAE, a program that parses a list of math symbols with locations to produce LaTeX and operator tree output. DRACULAE is part of the Freehand Formula Entry System, which is available for download. More recently, the dprl has created the min math entry and search interface and the accompanying Tangent math search engine, as part of an NSF-funded project to create usable math search systems for non-experts. Source code for min and other dprl systems (including all four dprl CROHME handwritten math competition entries) is available online through GitHub.
Here are some other math recognition systems that I know of.
- XPress by Marco Pollanen and his students. This online program has a simple interface for placing symbols using a keyboard and mouse. DRACULAE is used to parse the symbols and produce LaTeX output.
- Natural Log by Matsakis, Miller, and Viola (MIT)
- JIMHR: (Java-Based) Interactive Math Handwriting Recognizer, a merge and port of FFES/DRACULAE and the Natural Log system by Joy-Gong Ho (Acuitus Corp., USA)
- MathFoR project, includes JMathNotes by Ernesto Tapia (Free University of Berlin)
- Infty by M. Suzuki et al. (Kyushu University, Japan)
- MathJournal by XThink Inc. As far as I know, this is the first commercial pen-based math recognition system
- MathPad by Joseph LaViola (Univ. Central Florida)
- MathBrush by George Labahn et al. (Univ. Waterloo)
- Windows 7 includes handwritten math input (paper by Marko Panic).
- Detexify, a web-based application that retrieves LaTeX commands using hand-drawn symbols as queries (by Daniel Kirsch and Phillip Kühl)
- Web Equation by VisionObjects Corporation (France).
- Online math recognizer by Francisco Alvaro (Univ. Valencia, Spain)
Here are some other math retrieval systems that I'm aware of.
- Approach0 (by Wei Zhong, Univ. Delaware - indexes Math StackExchange)
- NIST Digital Library of Mathematical Functions (search engine by Bruce Miller and Abdou Youssef)
- MIaS, WebMIaS (Petr Sojka, Martin Liska et al., Masaryk University (Czech Republic))
- MathWebSearch (with template editor) and zbMath search engine (for Zentralblatt MATH) (M. Kohlhase et al., Jacobs University Bremen (Germany))
- EgoMath (Jozef Misutka and Leo Galambos, Charles University in Prague (Czech Republic))
- WikiMirs (Xiaoyan Lin et al., Peking University, China)
- MASE: A Math-aware Search Engine (Tam T. Nguyen et al., Nanyang Technological University, Singapore)
- Symbolab (commercial application)
- WolframAlpha (Note: LaTeX queries accepted)
- Springer LaTeX Search (Beta)
About doing research... (R. Zanibbi, Oct. 2014, edited Aug. 2016)
The cartoons below succinctly represent common misconceptions about science and research. While I genuinely enjoy doing research, I have to remind myself often that good research is a slow and difficult business. Acknowledging the limitations of both your work and yourself is critical, as is learning to appreciate continual learning and the process of doing research. Writing up and presenting results is usually relatively quick. Obtaining meaningful results for hard problems can require years of work, and even if one manages to obtain them, the results seldom, if ever, flow from the 'elegant, correct and complete' model that one hopes for when starting a project. If, despite all these challenges, one finds oneself repeatedly interested in starting new research projects, then one is a researcher.
On a personal note, I 'left' music at the end of my undergraduate studies to do graduate studies in Computer Science. This was to be a simpler and saner way to live, I thought. I then discovered that the day-to-day activities of a researcher and musician/composer are basically the same, with irregular work schedules and locations, meeting and collaborating with people from diverse backgrounds, sometimes working extremely hard for very long hours, performing (aka 'giving a talk' or 'writing a paper'), travel, finding ways to both be and remain creative, and lots and lots and lots of improvisation and hustling.
It seems to me that leading a research lab is a lot like leading a jazz ensemble. You have a fluid stream of people joining and leaving the group, each with different skills and interests. You 'perform' with this group regularly (i.e., carry out experiments and other research tasks), despite changing personnel. A good jazz group tries to push the music, exploring possibilities to produce enjoyable and interesting sounds. The challenge for the leader of such a group is to maintain enough structure and discipline without squashing good ideas and directions from band members during performances. This involves risk, which inevitably leads to errors, and some performances fail. Without a proper balance between form and instinct, things are boring or messy. It took a while for me to understand and accept this, but once I did, I found that I was frustrated far less often. I reasoned that if research is a creative process, then the process will often be unpredictable, and that this uncertainty may be necessary to produce interesting work.
Many of the best ideas and directions for research I've come across were due to classes that I've taught. Teaching and research, when done well, are complementary, even intertwined. Until you can explain your work briefly and make clear where the remaining holes are, you probably have not reached the limits of your ability to understand your work.
At its best, I think that research involves creating helpful knowledge and tools to share with others. These helpful tools include well-designed and well-documented prototypes, and informative, well-written research papers. Whether or not I ever 'truly' create these with my students and collaborators, they remain the primary goals for my work. I would rather fail trying to do this than stop.