Data Mining (Graduate)
4005-775
2012-4 Summer Quarter (First 5 Weeks)
Instructor Dr. Trudy Howles
URL http://www.cs.rit.edu/~tmh
I will post my office hours schedule at this URL. Follow the link on my home page to the weekly schedule and assignment page (password protected). I will use myCourses to record grades and for other course activities.
Meeting Dates Tuesday & Thursday 10:00 - 1:50. Note that each class meeting is 4 hours long.
Text

Required: Data Mining: Practical Machine Learning Tools and Techniques, 2nd Ed., by Witten and Frank (Morgan Kaufmann, 2005)

Course Prerequisites 4005-771 (Data Exploration & Management)

Note that 771 is a prerequisite (NOT a corequisite). If you have not completed the prerequisite, you must drop the course.

Overview

This course provides an introduction to the major concepts and techniques used in mining large databases. Topics include the knowledge discovery process, data exploration and cleaning, classification algorithms, association rule mining, clustering, and text mining. The course also focuses on the social and ethical issues related to data mining. Computing projects, a term paper, and presentations are required.

Software

We will be using the open source Weka data mining tool for assignments. We may also work with other tools of your choice, such as Rattle or R.

Weka is available at www.cs.waikato.ac.nz/ml/weka/index_home.html

We have Weka running on the CS machines. Type weka at the prompt.

Rattle is available at http://rattle.togaware.com


Course Outcomes

At the completion of this course, the successful student will be able to:

  1. identify and apply basic techniques used in knowledge discovery and data mining
  2. describe and apply basic algorithms for data mining
  3. analyze data mining results to reach conclusions or prove/disprove hypotheses
  4. describe the legal and ethical issues involved in data mining

Grading

Exam #1 (Week 2 Thursday) 15%
Exam #2 (Week 4 Tuesday) 15%
Final (Week 5 Thursday) 15%
Term Team Project and Presentation 35%
Data Mining Research Assignment 10%
Homework 10%

Letter grades are determined as follows:

90 and above  A
80-89  B
70-79  C
60-69  D
Below 60  F
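
As a rough illustration of how the weights and letter-grade cutoffs above combine, here is a minimal Python sketch. The component names, the sample scores, and the assumption that every component is scored on a 0-100 scale are hypothetical and not part of the syllabus. The syllabus also does not say how fractional totals are rounded, so the sketch compares the unrounded total against the cutoffs.

```python
# Weights from the Grading section, as percentages of the final grade.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "exam1": 15,
    "exam2": 15,
    "final": 15,
    "project": 35,
    "research": 10,
    "homework": 10,
}


def course_score(scores):
    """Weighted total of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 total to a letter using the cutoffs listed above.

    Assumption: the total is not rounded before comparison.
    """
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical scores for one student.
example = {
    "exam1": 85,
    "exam2": 78,
    "final": 90,
    "project": 92,
    "research": 88,
    "homework": 95,
}

total = course_score(example)
print(total, letter_grade(total))
```

Because the term team project carries 35% of the grade, it moves the total more than any two exams combined.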


General Course Policies

Attendance is expected. I am unable to repeat lectures or save notes and handouts for students who do not attend class. If you miss a class, it's up to you to get any missed materials, including handouts or announcements. Please do not email me asking for copies of homework assignments or handouts.

It is not possible for me to carry around tests, homework or projects I returned when you were not in class. If you miss class, it is your responsibility to stop by during my office hours to pick up these items.

My policy is that questions regarding graded work or exam grades must be resolved within one week of the day the work is handed back, after which time the grade will become permanent.

I do not give extra or bonus work to help students raise grades since that would be unfair to the rest of the students in the class who do not have the same opportunity. Besides, extra work, over and above the normal course work, is not likely to be helpful to students having trouble with the normal course work load.

I do not accept late work.

I do not accept any assignment submissions via e-mail, left under my door, etc. You must submit projects or homework as instructed or they will not be considered for credit.


Projects and Homework

Note that I set due dates on all myCourses dropboxes to 11:59 pm!

You will complete a term team project. You will also have two homework assignments.

The term team project will consist of several deliverables throughout the quarter.

The homework assignments will be announced and discussed in class and are to be individual efforts.

Assignments will not be accepted late. Regardless of the excuse, I will not accept them late. Start early.

This is a graduate course so I expect a high quality of work. This includes the following:

  1. All work will be carefully validated and verified before submitting
  2. All work will be a reflection of your individual effort
  3. Writing assignments will be subject to significant penalties if you submit sloppy work - misspelled words, poor grammar, etc. Be sure you spellcheck and proofread your work before you submit.

Tests

We will have 2 in-quarter exams plus the final exam in this course. All tests are scheduled in advance. Makeup tests will not be given, except in extreme situations. If a test is missed without prior arrangements, no makeup will be given.

I am not able to give the final exam early or reschedule the exam. Please plan accordingly.

After reviewing tests in class, I collect and keep them. Only then are grades recorded in my grade book. Anyone not returning a test will receive a zero for it.

If you have a question regarding your test or wish to look at it for any reason, you may do so in my office during office hours, or at any other mutually convenient time.


Your Responsibilities

Learning is a shared responsibility between you and me. I promise to come to class well prepared and ready to work; you need to make the same promise to yourself. This means:

  1. You will come prepared to class. This means you will have read the assignments, completed the homework, and brought your book, paper, and pen or pencil.
  2. All the material won't be covered in class -- there is not enough time. This is why homework and project assignments are important for you.
  3. I may include additional topics not covered in the book. If you miss class, it is your responsibility to get notes from another student.
  4. If you miss class, you should arrange to get copies of notes and handouts from another student. It's your responsibility to stay current on any announcements made in class.
  5. Homework and project due dates and exam dates will be announced well in advance. Get the dates into your calendar (PDA, planner, ...) as soon as they are announced.

Academic Honesty

It is a shame that this must be stated at all, but there are always a few students who do not abide by the rules of proper academic conduct. For the record: All course work is expected to be an individual effort. The sharing of any work or work products, by any means, is not allowed.

You may help each other with assignments, within limits, as it can be a good way for each participant to learn. Examples of acceptable help are proofreading drafts, participating in study groups, or brainstorming ideas, designs or concepts.

It is not appropriate to have someone write all or part of an assignment for you or to plagiarize from another source. All assignments are expected to be your original work. Plagiarism is the theft of another person's thoughts or ideas. Plagiarism is cheating. If you reference another person's work or ideas, you must give credit to the original author.

Those who behave in a dishonest or unethical manner in computer science courses, or in their dealings with the Computer Science Department, are subject to disciplinary action. In particular, dishonest or unethical behavior in the execution of assigned work in this course will be treated as follows:

    For this course, I will assign an automatic F grade to anyone caught plagiarizing another person's work -- whether the work of a published author or another student.

    Furthermore, the following action will be taken for each person involved in the incident, whether currently enrolled in the course or not:

    If the student is a computer science major, a letter recording the incident will be placed in the student's departmental file; otherwise, the letter will be forwarded to the student's department chair or program coordinator.

    Violations of the "Code of Conduct..." can also result in suspension, expulsion and even criminal charges.

Incidents of academic dishonesty are uncomfortable for everyone involved. Please speak with me if you are not sure about what is allowed.

Withdrawal Policy

I must abide by the Institute withdrawal policy and will not sign any withdrawal requests after the date set by the Registrar.

Rev. 05/17/13 by tmh