Overview
Search Engine Log Analysis
Software Requirements
-- Clarifications
Software Design
Submission Requirements
Grading Criteria
Late Projects
Plagiarism
Resubmission
Write a Java program that uses multiple threads to learn about thread creation, execution, and termination.
Privacy Invaders, Inc. runs PI, the best search engine in the world. The company has a server farm with dozens of web servers that respond to search requests from users all over the Internet. Each web server keeps a log of search requests the server has processed. Each log is a plain text file. Each line of the file contains the following information about one request: date in the format YYYY/MM/DD, time in the format HH:MM:SS, user's IP address in dotted decimal notation, and a list of one or more words that are the search terms the user specified. Each information item is separated from the next by a space character.
Here is an example of a log file named log01.txt:
2012/11/29 13:00:42 1.1.2.3 bill gates
2012/11/29 13:00:45 2.3.5.8 steve jobs
2012/11/29 13:01:02 3.5.8.13 car bomb taliban
2012/11/29 13:05:56 3.5.8.13 nuclear weapon iran
2012/11/30 09:45:16 5.8.13.21 nutcracker
2012/11/30 10:14:07 8.13.21.34 sports car
2012/11/30 10:14:09 3.5.8.13 hezbollah bomb car
Here is an example of a log file named log02.txt:
2012/11/29 13:00:32 1.1.2.3 paul allen
2012/11/29 13:02:17 3.5.8.13 car bomb al qaeda
2012/11/30 10:14:17 8.13.21.34 nissan car dealer rochester ny
2012/11/30 10:15:21 3.5.8.13 iraq wmd
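The line format described above can be read with a single split on the space character. The following is a minimal sketch, not a required design; the class name LogEntry and its fields are hypothetical and do not come from the requirements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; the name and fields are illustrative only.
class LogEntry {
    final String date;        // YYYY/MM/DD
    final String time;        // HH:MM:SS
    final String ip;          // dotted decimal, e.g. 3.5.8.13
    final List<String> terms; // one or more search terms

    LogEntry(String date, String time, String ip, List<String> terms) {
        this.date = date;
        this.time = time;
        this.ip = ip;
        this.terms = terms;
    }

    // Splits one log line on single spaces: date, time, IP address, then terms.
    static LogEntry parse(String line) {
        String[] fields = line.split(" ");
        if (fields.length < 4) {
            throw new IllegalArgumentException("Malformed log line: " + line);
        }
        List<String> terms = new ArrayList<String>();
        for (int i = 3; i < fields.length; i++) {
            terms.add(fields[i]);
        }
        return new LogEntry(fields[0], fields[1], fields[2],
                            Collections.unmodifiableList(terms));
    }
}
```

For example, LogEntry.parse applied to the third line of log01.txt yields the IP address 3.5.8.13 and the three terms car, bomb, and taliban.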
Privacy Invaders, Inc. sells a service that lets customers analyze the server logs. The customer specifies a list of one or more search terms. Privacy Invaders, Inc. then analyzes each log file and prints a report for each log file. The first line of the report is the name of the log file. The rest of the report contains one line for each user that did a search containing all the search terms the customer specified. A "user" is a unique IP address. A search term the customer specified must exactly match a search term in the log file, except that uppercase versus lowercase does not matter. Each line contains the following information about one user: IP address in dotted decimal notation, a space character, and the number of searches the user performed that contained all the specified search terms. The users are listed in ascending order of IP address.
Here are the reports for the above log files if the customer specifies the search term "car":
log01.txt
3.5.8.13 2
8.13.21.34 1
log02.txt
3.5.8.13 1
8.13.21.34 1
Here are the reports for the above log files if the customer specifies the search terms "car bomb":
log01.txt
3.5.8.13 2
log02.txt
3.5.8.13 1
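The matching rule illustrated by these reports (every specified term must appear in the search, compared without regard to case) can be sketched as follows. This is one possible approach under stated assumptions, not a prescribed design: the class name SearchCounter and its methods are hypothetical, and ordering the users for the report is deliberately left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; names are illustrative only.
class SearchCounter {
    // True if the log line's search terms include every wanted term,
    // ignoring uppercase versus lowercase.
    static boolean matchesAll(String logLine, List<String> wanted) {
        String[] fields = logLine.split(" ");
        Set<String> present = new HashSet<String>();
        for (int i = 3; i < fields.length; i++) {   // fields 0..2 are date, time, IP
            present.add(fields[i].toLowerCase(Locale.ROOT));
        }
        for (String term : wanted) {
            if (!present.contains(term.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    // Counts, per IP address, the searches that contain all wanted terms.
    // The returned map is unordered; sorting by IP address is a separate step.
    static Map<String, Integer> count(List<String> logLines, List<String> wanted) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String line : logLines) {
            if (matchesAll(line, wanted)) {
                String ip = line.split(" ")[2];
                Integer old = counts.get(ip);
                counts.put(ip, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }
}
```

Terms are compared as whole words, so "car" matches the search "sports car" but would not match a search term such as "cars".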
Note: Ascending order of IP address is defined as follows. The dotted decimal IP address a.b.c.d, where a, b, c, and d are integers in the range 0 to 255, is converted to a number using this formula:
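One conventional conversion, assumed here rather than taken from the requirements, is a·2^24 + b·2^16 + c·2^8 + d, with addresses then compared by the resulting numbers. A minimal sketch under that assumption follows; the class name IpOrder is hypothetical.

```java
import java.util.Comparator;

// Hypothetical comparator; the conversion formula is an assumption
// (a*2^24 + b*2^16 + c*2^8 + d), not a quotation from the requirements.
class IpOrder implements Comparator<String> {
    // Converts a dotted decimal address a.b.c.d to a single number.
    static long toNumber(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not a dotted decimal address: " + ip);
        }
        long n = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Octet out of range: " + ip);
            }
            n = (n << 8) | octet;   // shift previous octets left, append this one
        }
        return n;
    }

    // Orders addresses by their numeric value, not by their text.
    public int compare(String x, String y) {
        long a = toNumber(x);
        long b = toNumber(y);
        return a < b ? -1 : (a > b ? 1 : 0);
    }
}
```

Comparing numerically matters because a plain string comparison would place 10.0.0.1 before 9.0.0.1.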
Programming Project 1 will calculate and print a separate report for each of several log files in multiple threads.
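As an illustration of thread creation, execution, and termination, the sketch below starts one thread per log file, waits for each to finish, and collects the reports. It is only a sketch under assumptions: the names ReportThreads, Worker, and runAll are hypothetical, the analysis itself is replaced by a placeholder that produces just the first report line, and emitting the reports in command-line order is an assumption, not a stated requirement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; names and structure are illustrative only.
class ReportThreads {
    // One worker thread per log file; it stores its report when done.
    static class Worker extends Thread {
        final String fileName;
        volatile String report;

        Worker(String fileName) {
            this.fileName = fileName;
        }

        public void run() {
            // Placeholder for the real analysis: only the first report
            // line (the log file name) is produced here.
            report = fileName + "\n";
        }
    }

    // Starts all workers, then joins them in the order given (assumed order).
    static String runAll(List<String> fileNames) {
        List<Worker> workers = new ArrayList<Worker>();
        for (String name : fileNames) {
            Worker w = new Worker(name);
            workers.add(w);
            w.start();                      // creation and execution
        }
        StringBuilder out = new StringBuilder();
        for (Worker w : workers) {
            try {
                w.join();                   // wait for termination
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
            out.append(w.report);
        }
        return out.toString();
    }
}
```

Having each thread build its report as a string, instead of printing directly, keeps lines from different reports from interleaving on the output.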
java Analyze <searchterms> <filename> ...
Note: This means that the program's class must be named Analyze, and this class must not be in a package.
Note: Here are some example command lines:
java Analyze car log01.txt log02.txt
java Analyze car+bomb log01.txt log02.txt log03.txt
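Judging from the example command lines, multiple search terms appear to be joined with a plus sign in the first argument, and the remaining arguments are log file names. A minimal sketch of splitting the arguments under that reading follows; the class name AnalyzeArgs is hypothetical, and throwing an exception on too few arguments is an assumption, since the required error behavior is not stated here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names and error handling are illustrative only.
class AnalyzeArgs {
    final List<String> terms;      // search terms from the first argument
    final List<String> fileNames;  // log file names from the remaining arguments

    AnalyzeArgs(List<String> terms, List<String> fileNames) {
        this.terms = terms;
        this.fileNames = fileNames;
    }

    // Assumes terms are separated by '+', as in the example "car+bomb".
    static AnalyzeArgs parse(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException(
                "Usage: java Analyze <searchterms> <filename> ...");
        }
        List<String> terms = new ArrayList<String>();
        for (String t : args[0].split("\\+")) {
            if (t.length() > 0) {          // skip empty pieces such as in "car++bomb"
                terms.add(t);
            }
        }
        List<String> fileNames = new ArrayList<String>();
        for (int i = 1; i < args.length; i++) {
            fileNames.add(args[i]);
        }
        return new AnalyzeArgs(terms, fileNames);
    }
}
```

With the second example command line, this yields the terms car and bomb and the three file names log01.txt, log02.txt, and log03.txt.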
Clarifications to the Requirements
Your project submission will consist of a Java archive (JAR) file containing the Java source file for every class and interface in your project. Put all the source files into a JAR file named "<username>.jar", replacing <username> with the user name from your Computer Science Department account. The command is:
jar cvf <username>.jar *.java
If your program uses classes or interfaces from the Computer Science Course Library without changes, then you do not need to include these classes' or interfaces' source files in your JAR file. If your program uses classes or interfaces from the Computer Science Course Library with changes, then you do need to include these classes' or interfaces' source files in your JAR file.
Send your JAR file to me by email at ark@cs.rit.edu. Include your full name and your computer account name in the email message, and include the JAR file as an attachment.
When I get your email message, I will extract the contents of your JAR file into a directory. However, I will not replace any of the source files in the Computer Science Course Library with your source files; your project must compile and run with your files in their own separate directory. (You can do this project without needing to replace any source files in the Computer Science Course Library.) I will set my Java class path to include the directory where I extracted your files and the directory where the Computer Science Course Library is installed. I will compile all the Java source files in your program using the JDK 1.6.0 compiler. I will then send you a reply message acknowledging I received your project and stating whether I was able to compile all the source files. If you have not received a reply within one business day (i.e., not counting weekends), please contact me. Your project is not successfully submitted until I have sent you an acknowledgment stating I was able to compile all the source files.
The submission deadline is Wednesday, December 12, 2012 at 11:59pm. The date/time at which your email message arrives in my inbox (not when you sent the message) will determine whether your project meets the deadline.
You may submit your project multiple times before the deadline. I will keep and grade only your most recent submission that arrived before the deadline. There is no penalty for multiple submissions.
If you submit your project before the deadline, but I do not accept it (e.g. I can't compile all the source files), and you cannot or do not submit your project again before the deadline, the project will be late (see below). I strongly advise you to submit the project several days before the deadline, so there will be time to deal with any problems that may arise in the submission process.
I will grade your project by:
When I run your program, the Java class path will point first to the directory with your compiled class files, followed by the directory where the Computer Science Course Library is installed. I will use JDK 1.6.0 to run your program.
I will grade the test cases based solely on whether your program produces the correct output as specified in the above Software Requirements. Any deviation from what is specified will result in a grade of 0 for the test case. This includes errors in the formatting (such as extra spaces, missing spaces, the use of a tab instead of a space), incorrect upper/lowercase, incorrect punctuation, misspelled words, missing output, and extraneous output not called for in the requirements. The requirements state exactly what the output is supposed to be, and there is no excuse for outputting anything different. If any requirement is unclear, please ask for clarification.
If there is a defect in your program and that same defect causes multiple test cases to fail, I will deduct points for every failing test case. The number of points deducted does not depend on the size of the defect; I will deduct the same number of points whether the defect is 1 line, 10 lines, 100 lines, or whatever.
After grading your project I will put your grade and any comments I have in your encrypted grade file. For further information, see the Course Grading and Policies and the Encrypted Grades.
The log files used to grade the test cases are:
log01.txt
log02.txt
log03.txt
log04.txt
If I have not received a successful submission of your project by the deadline, your project will be late and will receive a grade of 0. You may request an extension for the project. There is no penalty for an extension. See the Course Policies for my policy on extensions.
Programming Project 1 must be entirely your own individual work. I will not tolerate plagiarism. If in my judgment the project is not entirely your own work, you will automatically receive, as a minimum, a grade of zero for the assignment. See the Course Policies for my policy on plagiarism.
If you so choose, you may submit a revised version of your project after you have received the grade for the original version. You are allowed to make one and only one resubmission of the project. However, if the original project was not successfully submitted by the (possibly extended) deadline or was not entirely your own work (i.e., plagiarized), you are not allowed to submit a revised version. Submit the revised version via email in the same way as the original version. I will accept a resubmission up until 11:59pm Tuesday 08-Jan-2013. I will grade the revised version using the same criteria as the original version, then I will subtract 2 points as a resubmission penalty. The revised grade will replace the original grade, even if the revised grade is less than the original grade.