Alan Kaminsky Department of Computer Science Rochester Institute of Technology 4486 + 2220 = 6706
Home Page
Parallel Computing II 4005-736-70 Spring Quarter 2012
Course Page

4005-736-70 Parallel Computing II
Research Investigation

Prof. Alan Kaminsky -- Spring Quarter 2012
Rochester Institute of Technology -- Department of Computer Science

Research Theme
Web Site
Hard Problem
Literature Search
Programs
Data Collection
Weekly Status
Draft Report
Final Report
Source Code Archive
Draft Presentation
Final Presentation


Research Theme

In the Parallel Computing I course, I said the two chief uses of a parallel computer are speedup -- to solve the same size problem in less time -- and sizeup -- to solve a larger problem in the same time.

I've recently become aware that there's another way to utilize a parallel computer.

For some problems, an algorithm to find the exact solution requires exponential time. Examples include notorious NP-hard problems like the Traveling Salesman Problem (TSP) and Boolean satisfiability (SAT). For other problems, a polynomial-time algorithm to find the exact solution exists, but the algorithm might be slow (e.g. O(n log n) or O(n2) rather than O(n)) or might require intricate data structures or coding. One example is finding a minimum spanning tree of a graph.

However, there are often quick, randomized algorithms to find an approximate solution to such problems. For example, for the TSP, pick a random starting tour and rearrange randomly chosen sections of the tour until no further reductions in the tour length are obtained. The result might not be the shortest possible tour, but it might be close to the optimal tour. Now repeat that a million or a billion times and keep the best tour; the result might be even closer to the optimal tour.

Defining the quality of an approximate solution to be its closeness to the exact solution, doing multiple repetitions of the approximate algorithm tends to improve the quality of the solution. If its quality is high enough, the solution might be useful for practical purposes, even though it is not an exact solution.

Here's where the parallel computer comes in. Instead of spending a certain amount of time to run the exact algorithm (sequentially or in parallel), spend that time to do repetitions of the approximate algorithm in parallel. The parallel computer provides two key benefits in this scenario:

The research theme for the Parallel Computing II course is to investigate using the parallel computer to get a "quality-up" on hard problems using repetitions of quick, randomized, approximate algorithms. You will choose a hard problem and investigate these questions:


Web Site

You must set up a web site for your investigation. You may use the web site in your CS account (public_html directory) or any other publicly-accessible web site.

You must post the following items on your web site as your investigation progresses. These items are described in more detail below.


Hard Problem

You must choose a hard problem to investigate. This problem must:

To find a suitable problem, I suggest reading up on graph algorithms. These are often exponential-time, or polynomial-time but difficult. NP-hard problems that are not graph problems may also be suitable.

You must post a brief description of your chosen problem on your web site. The description can be embedded in the web site itself, or it can be a downloadable Adobe Acrobat (.pdf) document, or it can be a downloadable plain text (.txt) document. I will not accept any other document formats.

The initial web site is worth 5% of your course grade. To receive full credit, I must receive an email from you containing your name and the URL of your web site, with the description of your chosen problem posted, by 11:59pm Monday 26-Mar-2012. Otherwise, you will receive a grade of 0 for the initial web site.

Each student must choose a different problem to investigate. As students send me their web site URLs, I will post links on the course home page so you can see what your classmates have chosen. If you choose a problem the same as or substantially similar to a problem another student has already chosen, I will not accept your problem, and you will have to choose a different problem.

I will be doing an investigation of the Maximum Satisfiability (MAX-SAT) Problem during the course, so you are not allowed to pick that one.


Literature Search

You must do a literature search to find the following:

You will present the findings of your literature search in class during a weekly status report session (see below). You will also present your findings in your draft paper, final paper, draft presentation, and final presentation (see below).


Programs

You must develop two programs to solve your chosen problem:

You may choose to write your two programs either

Both programs must be written for and run on the same machine.

You must keep track of the number of hours you spend writing, debugging, and testing your two programs. You will use this data to address the research question, "How does the effort of writing a parallel approximate algorithm for the problem compare to the effort of writing an exact sequential algorithm for the problem?"


Data Collection

You must define a metric for the quality of the approximate algorithm's solution. The quality metric must be a number between 0.0 and 1.0. The quality metric must be 1.0 if the approximate solution is identical to the exact solution (i.e., the best possible solution) for a given problem. As the approximate solution becomes worse than the exact solution, the quality metric must decrease from 1.0 down to 0.0.

You must run your sequential program on a minimum of three different problem sizes, to collect running time data as well as the exact solutions to the problems. The sequential program's running time on the smallest problem size must be a minimum of 60 seconds.

You must run your parallel program on the same problem instances as the sequential program and for a range of numbers of repetitions, using all the cores on the parallel computer, to collect running time data as well as the approximate solutions to the problems. You can then calculate the quality metrics for the approximate solutions.

You will use this data to address the research questions, "For various sizes of the problem, what happens to the parallel approximate algorithm's solution quality as the number of repetitions increases?" and "For various sizes of the problem, how much faster than the exact sequential algorithm is the parallel approximate algorithm as a function of solution quality?"


Weekly Status

Beginning Wednesday 28-Mar-2012 and continuing for eight weeks, one class session each week will be devoted to reporting the status of the students' investigations. Each weekly status is worth 5% of your course grade. To receive full credit, you must:

I will grade the weekly status on a scale of 0 to 4:
4 = Satisfactory
3 = Needs improvement
2 = Unsatisfactory
0 = Absent, or acceptable status report not posted before start of class

Absences. If you are absent from one of the weekly status report class sessions, you will receive a grade of 0 for that class session unless before the start of the class session you make an alternate arrangement with me. I am normally willing to permit this only for illness or unforeseen personal emergency. However, if you feel you have a valid reason for your absence, please discuss it with me. Appointments, job interviews, career fairs, vacations, trips home, and other scheduled activities are not valid excuses for absence. You have an obligation to this course, and you must schedule other activities so as not to interfere with class sessions.

I expect you to make substantial progress on your investigation each week. If you do not, your weekly status grade for that week will be reduced. Here is what I expect you to have accomplished each week (if you accomplish more, that's great):


Draft Report

You must write a draft report with the results of your investigation and post it on your web site in the form of a PDF file. I will not accept any other document formats. The draft report must be posted by 11:59pm Thursday 03-May-2012.

The draft report must be a complete draft. In other words, your investigation must be completely finished and written up in draft form by 03-May-2012. The only thing left to do might be to revise the style, organization, level of detail, figures, etc. in the draft report.

The draft report must be in a form publishable in a research journal or conference; see any journal or conference proceedings for examples. The draft report must be at most 10 pages long.

The draft report must include this information:

The draft report is worth 10% of your course grade. I will grade the draft report on a scale of 0 to 10, with 10 being outstanding and 1 being unacceptable. If the draft report is not posted on your web site in an acceptable format by the deadline, the draft report will receive a grade of 0.

We will critique each other's draft reports in class on Wednesday 09-May-2012. I will post links to the draft reports on the course web site. I expect you to read the other students' draft reports and be prepared to critique them in class.

Extensions. You may request an extension of the deadline for the draft report. See the Course Policies for my policy on extensions. If the draft report is not posted on your web site in an acceptable format by the extended deadline, the draft report will receive a grade of 0.

Plagiarism. The draft report must be entirely your own work. I will not tolerate plagiarism. See the Course Policies for my policy on plagiarism.


Final Report

You must write a final report with the results of your investigation and post it on your web site in the form of a PDF file. I will not accept any other document formats. The final report must be posted by 11:59pm Monday 14-May-2012.

The final report must be in a form publishable in a research journal or conference; see any journal or conference proceedings for examples. The final report must be at most 10 pages long. The final report must include the same information as the draft report, incorporating any feedback from the critique in class.

The final report is worth 15% of your course grade. I will grade the final report on a scale of 0 to 10, with 10 being outstanding and 1 being unacceptable. If the final report is not posted on your web site in an acceptable format by the deadline, the final report will receive a grade of 0.

Extensions. You may request an extension of the deadline for the final report. See the Course Policies for my policy on extensions. If the final report is not posted on your web site in an acceptable format by the extended deadline, the final report will receive a grade of 0.

Plagiarism. The final report must be entirely your own work. I will not tolerate plagiarism. See the Course Policies for my policy on plagiarism.


Source Code Archive

You must collect all your program source code files, input files, output files, test scripts, and so on into a single archive file, either a JAR file, a ZIP file, or a TAR GZIP file. I will not accept any other archive formats. You must post the archive file on your web site by 11:59pm Monday 14-May-2012.

The source code archive is worth 15% of your course grade. I will grade the source code archive on a scale of 0 to 10, with 10 being outstanding and 1 being unacceptable. If the source code archive is not posted on your web site in an acceptable format by the deadline, the source code archive will receive a grade of 0.

Extensions. You may request an extension of the deadline for the source code archive. See the Course Policies for my policy on extensions. If the source code archive is not posted on your web site in an acceptable format by the extended deadline, the source code archive will receive a grade of 0.

Plagiarism. The source code archive must be entirely your own work. I will not tolerate plagiarism. See the Course Policies for my policy on plagiarism.


Draft Presentation

You must write a draft presentation with the results of your investigation and post the slides on your web site in the form of a PDF file. I will not accept any other document formats. The draft presentation must be posted before the start of class Monday 14-May-2012.

The draft presentation must cover the same information as the draft report and final report. You will have a maximum of 20 minutes for the presentation.

You will give the draft presentation in class on Monday 14-May-2012. You will project the slides from your own laptop using the classroom PC projector; if you don't have a laptop, borrow one. We will critique each other's draft presentations. The draft presentation will be graded as part of the weekly status grade for that day.

Absences. If you are absent the day of the draft presentation, you will receive a grade of 0 for the draft presentation unless before the start of the class session you make an alternate arrangement with me. I am normally willing to permit this only for illness or unforeseen personal emergency. However, if you feel you have a valid reason for your absence, please discuss it with me. Appointments, job interviews, career fairs, vacations, trips home, and other scheduled activities are not valid excuses for absence. You have an obligation to this course, and you must schedule other activities so as not to interfere with class sessions.

Plagiarism. The draft presentation must be entirely your own work. I will not tolerate plagiarism. See the Course Policies for my policy on plagiarism.


Final Presentation

You must write a final presentation with the results of your investigation and post the slides on your web site in the form of a PDF file. I will not accept any other document formats. The final presentation must be posted before the start of class Wednesday 16-May-2012.

The final presentation must cover the same information as the draft report and final report, incorporating any feedback from the critique in class. You will have a maximum of 20 minutes for the presentation.

You will give the final presentation in class on Wednesday 16-May-2012. You will project the slides from your own laptop using the classroom PC projector; if you don't have a laptop, borrow one.

The final presentation is worth 15% of your course grade. I will grade the final presentation on a scale of 0 to 10, with 10 being outstanding and 1 being unacceptable. If the final presentation is not posted on your web site in an acceptable format by the deadline, or if you are absent, the final presentation will receive a grade of 0.

Absences. If you are absent the day of the final presentation, you will receive a grade of 0 for the final presentation unless before the start of the class session you make an alternate arrangement with me. I am normally willing to permit this only for illness or unforeseen personal emergency. However, if you feel you have a valid reason for your absence, please discuss it with me. Appointments, job interviews, career fairs, vacations, trips home, and other scheduled activities are not valid excuses for absence. You have an obligation to this course, and you must schedule other activities so as not to interfere with class sessions.

Plagiarism. The final presentation must be entirely your own work. I will not tolerate plagiarism. See the Course Policies for my policy on plagiarism.

Parallel Computing II 4005-736-70 Spring Quarter 2012
Course Page
Alan Kaminsky Department of Computer Science Rochester Institute of Technology 4486 + 2220 = 6706
Home Page
Copyright © 2012 Alan Kaminsky. All rights reserved. Last updated 06-Mar-2012. Please send comments to ark­@­cs.rit.edu.