Alan Kaminsky • Department of Computer Science • Rochester Institute of Technology • 4486 + 2220 = 6706
 Parallel Computing II • 4003-532-70/4005-736-70 • Spring Quarter 2007
Course Page

4003-532/4005-736 Parallel Computing II Module 2 Lecture Notes -- Hybrid SMP Cluster Programming Massively Parallel Problems

Prof. Alan Kaminsky
Rochester Institute of Technology -- Department of Computer Science

Problem: Cryptanalysis Using Exhaustive Key Search

• Plaintext block (128 bits) goes in
• Encrypted ciphertext block (128 bits) comes out
• Uses a 256-bit secret key

• Known plaintext attack
• Plaintext block is known
• Corresponding ciphertext block is known
• Key is not known
• Problem: Discover the key
• Once you have the key, you can decrypt everything (that was encrypted with that key)

• Exhaustive key search
• For every key from 0 to 2256-1, encrypt the given plaintext with that key, and see if you get the given ciphertext

• Partial exhaustive key search
• 256-N bits of the key are known, N bits of the key are unknown
• For every combination of unknown key bits from 0 to 2N-1, encrypt the given plaintext with that key, and see if you get the given ciphertext

• For further information, see Parallel Computing I Module 2 Lecture Notes

Massively Parallel Problems

• Massively parallel problem = Problem where there are no dependencies between the pieces

• On a SMP computer, this means that during the computation, no thread needs to use data created by another thread
• Thus, potential for good parallel performance

• On a cluster, this means that during the computation, no process needs to send messages to any other process
• Thus, potential for good parallel performance

Parallel Program Performance Metrics

• N = Problem size
• Must be directly proportional to the amount of computation

• K = Number of processors

• T(N,K) = Running time
• Depends on problem size and number of processors
• You need to carefully define what is included in the running time -- computation only? I/O also?
• It's best to do several (e.g., seven) running time measurements for the same N and K, then take the median of the results

• S(N,K) = Speed in units of computations/second
• S(N,K) = 1 / T(N,K)
• Depends on problem size and number of processors

• Speedup(N,K) = Speedup -- N stays constant, K increases
• Speedup(N,K) = S(N,K) / S(N,sequential) = T(N,sequential) / T(N,K)
• How much faster can we solve the same problem by going to more processors?
• Ideal speedup = K
• Calculated with respect to the running time for a sequential version of the program, not a parallel version with K=1

• Eff(N,K) = Efficiency -- N stays constant, K increases
• Eff(N,K) = Speedup(N,K) / IdealSpeedup = Speedup(N,K) / K
• Ideal efficiency = 1

• Plots of running time, speedup, efficiency versus K
• See Building Parallel Programs, Chapter 7

Hybrid Cluster SMP Version Performance

• Running time measurements on the Tardis cluster

• Problem size N = 32M, 64M, 128M, 256M keys searched (25, 26, 27, 28 missing key bits)

• Number of parallel processes Kp = 1, 2, 3, 4

• Number of parallel threads per process Kt = 4

• Number of effective processors K = 4, 8, 12, 16

• Running time T (msec) = median of seven program runs

```  N    K       T   Spdup    Eff         N    K       T   Spdup    Eff
---  ---  ------  ------  -----      ----  ---  ------  ------  -----
32M  seq   55573                     128M  seq  223269
32M    4   14083   3.946  0.987      128M    4   55564   4.018  1.005
32M    8    7156   7.766  0.971      128M    8   27947   7.989  0.999
32M   12    4813  11.546  0.962      128M   12   18971  11.769  0.981
32M   16    3700  15.021  0.939      128M   16   14489  15.410  0.963

64M  seq  111175                     256M  seq  442091
64M    4   27898   3.985  0.996      256M    4  110749   3.992  0.998
64M    8   14038   7.920  0.990      256M    8   55576   7.955  0.994
64M   12    9433  11.786  0.982      256M   12   37368  11.831  0.986
64M   16    7146  15.558  0.972      256M   16   27910  15.840  0.990
```

 Parallel Computing I • 4003-531-01/4005-735-01 • Spring Quarter 2013
Course Page
 Alan Kaminsky • Department of Computer Science • Rochester Institute of Technology • 4486 + 2220 = 6706