Department of Computer Science
Rochester Institute of Technology
Phone: (585) 475-4536
GitHub Page (dprl@RIT)
Please Note: where possible we make source code available for our prototypes under the GNU General Public License (GPL). LgEval and CROHMELib are made available under a non-commercial Creative Commons license.
- Whiteboard Video Summarization from the AccessMath project (**released Nov. 2017 for ICDAR**)
- Tangent math search engine (**Tangent-S released July 2017**)
- DPRL Natural Scene Text Detector
- min math search interface
- CROHME Handwritten Math Recognition Competition (incl. RIT system source code)
- Freehand Formula Entry System (early math entry prototype)
- USPTO patent figure and part label competition
- Evaluation tools (LgEval, CROHMELib, RSL) for CROHME, structural pattern recognition
AccessMath: Whiteboard Video Summarization
(K. Davila, Nov. 2017) Generates keyframe summaries of lecture videos that contain only the whiteboard contents. The system works with single-shot recordings of lecture videos, and was released as a companion to Kenny Davila's ICDAR 2017 paper on the same work. This work was later used to support keyframe-based video navigation and cross-modal visual math search in the Tangent-V (visual) search engine; see K. Davila's PhD dissertation for details.
- Demos: We have one demo for navigating lecture videos using keyframe summaries, and a more detailed demonstration of the Tangent-V visual formula search engine being applied to keyframe summaries. This second demo includes:
- Examples of keyframe video summaries
- Video navigation tool that allows traversal by clicking on 'ink' in keyframes
- Two binary image versions of the video, one with the speaker and one with the speaker removed. The second version allows only the whiteboard contents to be viewed throughout the video, for example.
- Visual, cross-modal formula search. Tangent-V can search formulas within video summary keyframes and lecture notes (in LaTeX), as well as between rendered LaTeX and handwritten formula images in generated keyframes.
- Source code (analysis + video frame ground truth creation tools): AccessMath_ICDAR_2017_code.zip
- Video annotations/data (256MB): AccessMath_ICDAR_2017_data.zip
- Original videos: Video Recordings
DPRL Natural Scene Text Detector
(S. Zhu, Apr. 2016) Note: we have identified issues with this code, and are working to resolve them. The code below was used to produce the results published in Siyu Zhu's 2016 CVPR paper, "A Text Detection System for Natural Scenes with Convolutional Feature Learning and Cascaded Classification", which obtained state-of-the-art results on the ICDAR 2015 Focused Scene Text Detection task at the time of publication.
- Static git repository (.zip archive)
Mathematical Information Retrieval
- Tangent-S (July 2017, by K. Davila, R. Zanibbi, A. Kane and F. Wm. Tompa). This version of the Tangent formula search engine supports individual and parallel search of formula appearance and semantics. This version extends Tangent v. 0.3.1 below, and is described in our SIGIR 2017 paper.
- Tangent v. 0.3.1 (May 2016, by K. Davila, R. Zanibbi, A. Kane and F. Wm. Tompa). This is the version described in our NTCIR-12 competition paper, with wildcard support for full subexpressions, and better separation of code for scoring metrics and locating subexpressions with the best match.
- Tangent v. 0.3 (July 2015, by R. Zanibbi, K. Davila, A. Kane and F. Wm. Tompa). You can download the source code and sample results (including .html pages with highlighted hits) below. This is the version described in our SIGIR 2016 paper.
- Tangent v. 0.2 (2014). Nidhin Pattaniyil implemented this extension of the Tangent system to support matrices and prefix subscripts and superscripts. This updated Tangent combines math expression retrieval with a Solr/Lucene text retrieval system, supporting mixed math and text queries.
Please Note: the files below are quite large, in part so that others have a better chance of replicating our results at NTCIR-11 (2014; NTCIR-11 paper).
- Tangent 0.2 Math Expression Retrieval (Source Code)
- Tangent 0.2 Solr/Lucene Modification for Text Search (Source Code)
- Results from NTCIR-11 Math Retrieval Task
- Tangent v. 0.1 demo (2013). A math expression search engine created by David Stalnaker. This online demo searches math expressions in an earlier version of English Wikipedia.
- Source code: GitHub Page
- multimodal math search interface (2011-2015, demo). Supports mouse/touch, keyboard, and (limited) image input. The program runs on tablets, desktops and laptops.
- Interface source code: GitHub Page
- Source code for recognition and other server applications used: git clone http://saskatoon.cs.rit.edu:10001/root/min-server-apps.git
- The handwritten symbol recognizer used by min is available below.
- The image-based symbol recognizer source code is available from GitHub
- Freehand Formula Entry System (FFES) and DRACULAE handwritten math parser (1999 - 2007); early pen-based equation editor (last release: Aug. 10, 2007)
CROHME Handwritten Math Recognition Competitions (web page)
- IAPR TC11 CROHME Web Page (datasets and evaluation tools)
- CROHME InkML file viewer (source code provided with CROHMELib below)
- Handwritten math symbol recognizers (source code)
- Complete systems submitted by the dprl (the 'RIT' team) for the competition:
- Early tools (2011) developed during the lab's participation in the first CROHME (R. Pospesel and K. Hart)
US Patent Office (USPTO) Figure and Part Label Detection Competition
- A paper describing an online competition for labeling parts in US patent diagram images has been posted on arXiv. It was co-authored by Chris Riedl (Northeastern, former Harvard post-doc), Marti Hearst (UC Berkeley), Siyu Zhu, Richard Zanibbi and researchers from the Harvard-NASA Tournament Lab (Karim Lakhani et al.).
- The data and source code for the top-5 placing systems in the competition are available through the UCI Machine Learning Repository.
- LgEval: the Label Graph Evaluation library (by R. Zanibbi and H. Mouchere). The library uses labeled directed graphs to represent results for structural pattern recognition tasks. To obtain the current version, issue the following command using git: git clone http://saskatoon.cs.rit.edu:10001/root/lgeval.git
- CROHMELib translation and file viewing utilities for CROHME InkML/MathML files (by R. Zanibbi and H. Mouchere). To obtain the current version, issue the following command using git: git clone http://saskatoon.cs.rit.edu:10001/root/crohmelib.git
  An earlier overview of CROHMELib and LgEval is available (from the CROHME 2013/2014 version of the tools).
- Recognition Strategy Language (version 2.0, implemented in Standard ML; Programmer's Guide to RSL). Ben Holm wrote this code along with a re-implementation of an American Sign Language video interpretation program using OpenCV for his MSc thesis in 2011 (with contributions to the RSL compiler by Matthew Fluet and Richard Zanibbi), and Chris Sasarak made modifications and extensions in 2012-2013. To obtain the source, issue the following commands using git:
  git clone http://saskatoon.cs.rit.edu:10001/root/bholm-thesis.git
  git clone http://saskatoon.cs.rit.edu:10001/root/rsl.git
  We hope to be able to provide the Recognition Strategy Library, a conversion of RSL into a Python library in the coming months (work by Chris Sasarak and Kevin Talmadge) - delayed, hopefully for not too long...
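The LgEval entry above mentions representing recognition results as labeled directed graphs. As a rough illustration of that idea only, here is a minimal Python sketch; it is not LgEval's actual API or file format. The names `LabelGraph` and `label_disagreements`, the `"_"` placeholder for "no relation", and the example labels `Sup` and `Right` are all hypothetical choices made for this example. The sketch assumes both graphs are defined over the same primitives (for instance, handwritten strokes) and simply counts node and edge labels that differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class LabelGraph:
    """Hypothetical labeled directed graph over a fixed set of primitives."""
    # primitive id -> symbol label
    nodes: Dict[str, str] = field(default_factory=dict)
    # (parent id, child id) -> relationship label
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)


def label_disagreements(output: LabelGraph, truth: LabelGraph) -> Tuple[int, int]:
    """Return (node label mismatches, edge label mismatches).

    An edge absent from one graph is treated as carrying the
    placeholder label "_" (no relation), an assumption of this sketch.
    """
    if output.nodes.keys() != truth.nodes.keys():
        raise ValueError("graphs must be defined over the same primitives")
    node_errors = sum(
        1 for p in truth.nodes if output.nodes[p] != truth.nodes[p]
    )
    pairs = set(output.edges) | set(truth.edges)
    edge_errors = sum(
        1 for e in pairs
        if output.edges.get(e, "_") != truth.edges.get(e, "_")
    )
    return node_errors, edge_errors


# Ground truth for a handwritten "x squared" drawn with two strokes.
truth = LabelGraph(
    nodes={"s1": "x", "s2": "2"},
    edges={("s1", "s2"): "Sup"},
)
# A recognizer output with correct symbols but the wrong spatial relation.
output = LabelGraph(
    nodes={"s1": "x", "s2": "2"},
    edges={("s1", "s2"): "Right"},
)
print(label_disagreements(output, truth))  # (0, 1)
```

Keeping node and edge mismatches as separate counts makes it possible to distinguish symbol classification errors from structural (relationship) errors in one comparison.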