Math search for the masses.

Overview

Project concluded in Fall 2022. Please visit the DPRL web page for information about follow-on work (e.g., with MathDeck).

We are creating a system to make finding mathematical information easier. We want students of all ages and the general public to be able to quickly look up unfamiliar symbols, and see how formulas are defined, used, and analyzed in online resources like Wikipedia, Math StackExchange, and technical document collections such as CiteSeerX.

These technologies will also be useful for math experts, and for exploring how math is used within and across disciplines. For example, a mathematician studying graph theory could use our system to find related applications in physics, ecology, and social networks.  

News

  • (Oct) Richard Zanibbi gave a talk for the Topos Institute on Mathematical Information Retrieval on Oct. 27th. The Topos colloquium series, including a YouTube link for the talk, may be found online here: https://topos.site/topos-colloquium.
  • (Aug) Behrooz Mansouri's paper on contextualized formula search using MathAMR has been accepted for publication at CIKM 2022.
  • (Aug) Behrooz Mansouri defended his dissertation on math-aware search on July 26th, and submitted the final document in early August. Congratulations to Behrooz on completing his PhD! The dissertation is available here
  • (July) ARQMath-3 is now complete. A big thanks to all who participated, and to our student assessors from RIT.
  • (May) As part of an independent study in Spring 2022, JP Ramissini has created a new YOLO-based detector for math formulas and chemical diagrams (GitLab link).
  • (May) Abhisek Dey is completing an internship at the University of Illinois this summer (as part of the MMLI project).
  • (May) Ayush Kumar Shah will be completing an internship with Amazon this summer as an Applied Research Scientist intern in Computer Vision / Deep Learning.
  • (May) Shaurya Rohatgi is completing another internship at AllenAI this summer.
  • (May) Congratulations to Matt Langsenkamp, who has been admitted to the PhD program at RIT.
  • (May) Congratulations to Abhisek Dey, who successfully completed his comprehensive (Research Potential Assessment) for the RIT PhD program.
  • (March) Congratulations to Behrooz Mansouri, who will be holding a tenure-track faculty position in the Computer Science Department at the University of Southern Maine!

MathDeck @ CHI 2021

An updated version of our math-aware search interface with multimodal LaTeX editing (MathDeck) was demonstrated at CHI 2021 (demo video).
Note: MathDeck works best with Google Chrome. 

ARQMath-3 @ CLEF 2022 

The third Answer Retrieval for Questions on Math (ARQMath -- Overview Video) Lab was run for CLEF 2022. Visit the task web page and Twitter page for more information. ARQMath-3 had tasks for answer retrieval for math questions, math formula retrieval, and a new open-domain question answering task.

Research Goals

To be successful, we need to create innovative search engines, interfaces, and algorithms for extracting and recognizing math. Here are the research topics we are currently working on: 
  • How people search for math online
  • Search interfaces that support easy formula authoring and easy inclusion of math in queries, and that present search results so they are easily read, organized, and reused
  • Indexing and search techniques for individual formulas
  • Indexing and search techniques for document collections that contain both text and math, with support for queries that combine keywords and formulas
  • Fast and accurate recognition of math in handwriting and images
  • Fast and accurate extraction of math from web pages and technical documents (including PDF files, which do not represent the locations or content of formulas)

Related Activities

  • ARQMath. The Answer Retrieval for Questions on Math task was run for a third time as part of CLEF 2022, and ARQMath-3 is now complete. See the ARQMath web page and Twitter page for updates. Registration for CLEF 2022 ends in late August (early registration is due July 20th).
  • CROHME + TFD 2019 Competition. In 2019 we again ran an international competition organized around data and evaluation tools for advancing the state of the art in handwritten formula recognition and in detecting formulas in document images. Mahshad Mahdavi and Richard Zanibbi co-organized the competition along with Harold Mouchère (Univ. Nantes, France) and Utpal Garain (ISI, India). The ICDAR paper on the competition is available.

The MathSeer Team


MathSeer is being developed through a collaboration of students and faculty at the Document and Pattern Recognition Lab at the Rochester Institute of Technology and the Intelligent Information Systems Research Laboratory at Penn State, along with faculty from the RIT Math department and the Computational Linguistics and Information Processing Lab at the University of Maryland, College Park.

Our multi-disciplinary team includes recognized experts in Information Retrieval, Pattern Recognition, Mathematics, and Math Education. Additional information may be found on the Members page.  

Support


The MathSeer project is made possible through research grants from the Alfred P. Sloan Foundation and the National Science Foundation (USA). All materials on this website reflect the work and opinions of the project team, and not the Alfred P. Sloan Foundation or the NSF. 
