Computing 2020 - Why we must teach database management early

Rajendra Raj
Department of Computer Science, RIT
rkr@cs.rit.edu

ABSTRACT

Recent reports from the National Science Foundation [1] and Microsoft Corporation [2] highlight that further development in the sciences will require the complete integration of computing principles and practices into the very fabric of science, as opposed to the mere application of computing to support scientists in their daily work. At the same time, current developments in diverse sciences have led to the generation of hundreds of terabytes of data, including genomic, particle physics, and astronomical data. All of this data needs to be managed (that is, stored in databases and subsequently retrieved) efficiently, and then analyzed rapidly to enable scientists to 'do' science in the 21st century [3]. In short, future advances in science depend on scientists being knowledgeable in database-centric computing [4].

At present, science majors are typically taught computing as a technique to apply to their work, with little or no emphasis on databases. The same is true of computing majors, who are typically introduced to computing through a traditional introduction to programming that does not use databases; in many computing majors, databases remain an advanced elective topic that is usually covered in the final year of the undergraduate curriculum. The current CS1 - CS2 introductory programming sequence in most computer science programs has not changed markedly in the last few decades, although it has moved from pre-structured through structured to object-oriented programming [5]. NSF's indictment is quite severe: "undergraduate computing education today often looks much as it did several decades ago" [1].

In this talk, I will propose an approach using database-centric courses for the first year of computing. The claim is that this approach will not only meet the needs of many non-computing majors but can also serve the needs of computing majors effectively. An inherent goal of this talk is to solicit feedback about the feasibility of such an approach.


References

  1. National Science Foundation, CISE Pathways to Revitalized Undergraduate Computing Education (CPATH), 2006. Accessed November 22, 2006. http://www.nsf.gov/pubs/2006/nsf06608/nsf06608.htm
  2. S. Emmott and others. Towards 2020 Science, Microsoft Corporation, 2006. Accessed November 22, 2006. http://research.microsoft.com/towards2020science/downloads/T2020S_ReportA4.pdf
  3. A. Szalay and J. Gray, Science in an exponential world, Nature, pp. 413-414, March 23, 2006. Accessed November 22, 2006. http://www.nature.com/nature/journal/v440/n7083/pdf/440413a.pdf
  4. J. Gray, D. T. Liu, M. A. Nieto-Santisteban, A. S. Szalay, G. Heber, and D. DeWitt, Scientific Data Management in the Coming Decade, Microsoft Research Technical Report, MSR-TR-2005-10, January 2005. (Also published as ACM SIGMOD Record 34 (4): 34-41, 2005.) Accessed November 22, 2006. ftp://ftp.research.microsoft.com/pub/tr/TR-2005-10.pdf
  5. P. T. Tymann and R. K. Raj, "Life after AP Computer Science." AP Annual Conference, Orlando, July 2006.