4003-532/4005-736 Parallel Computing II
Module 2 Lecture Notes -- Hybrid SMP Cluster Programming
Parallel Datastore Querying
Prof. Alan Kaminsky
Rochester Institute of Technology -- Department of Computer Science
Datastore Querying
- Datastore = A collection of pre-computed data that can be queried
- A more general and abstract concept than "database"
- Examples of datastores
- Flat file, plain text
- Flat file, binary (e.g. java.io.DataOutput, java.io.ObjectOutput)
- Database
- Data computed "on the fly"
- Query = A question to be answered using data in the datastore
- Computing the answer may require very little computation
- Computing the answer may require lots of computation
- Parallel datastore querying
- To get a speedup or sizeup when . . .
- Size of datastore is large, or . . .
- Number of queries is large, or . . .
- Computation needed for each query is large, or . . .
- All of the above!
- Strategy 1
- Size of datastore is small (can fit in main memory)
- Replicate the datastore in all processes
- Partition the queries among the processes
- Strategy 2
- Size of datastore is large (can't fit in main memory)
- Partition the datastore among the processes
- Replicate queries to all processes
- Combine (reduce) partial query results from each process
Real-world examples of parallel datastore querying
- Biological sequence matching -- DNA sequences, protein sequences
Prime Finding Using Parallel Datastore Querying
- Problem: Find all primes up to 2N, N <= 32
- Use Strategy 1
- Datastore = List of all odd primes up to 216 (there are only 6,541 of them)
- Datastore replicated in each process, computed on the fly
- Query = A block of 216 numbers -- 0..65535, 65536..131071, etc. up to 2N
- Query computation = Attempt to find a factor of each odd number in the block, using trial division by the primes in the datastore
- If there are no factors, the number is prime
- Store query results (list of prime numbers) in an output file
- Sequential program
- Hybrid SMP cluster parallel program
- Package edu.rit.hyb.prime
- Master-worker pattern for process parallelism
- Master sends a query block to worker
- Worker sends list of primes in that block to master
- Master writes list of primes to output file
- Master-worker pattern gives load balancing
- Parallel loop pattern for thread parallelism in each worker
- Numbers in query block tested for primality in parallel
- Guided schedule for load balancing
- Example: Generate primes up to 2N for N = 26
java -Dpj.np=$KP -Dpj.nt=$KT edu.rit.hyb.prime.Prime32Hyb primes.dat $N
- Running times for the above example on the Tardis hybrid SMP cluster parallel computer
- K = Kp * Kt = Calculation parallelism
Due to Kp
N Kp Kt K T (msec) Spdup Effi.
---------------------------------------
26 seq 255081
26 1 1 1 252264 1.011 1.011
26 2 1 2 125973 2.025 1.012
26 3 1 3 83283 3.063 1.021
26 4 1 4 63116 4.041 1.010
26 5 1 5 50489 5.016 1.003
26 6 1 6 42217 6.042 1.007
26 7 1 7 36118 7.062 1.009
26 8 1 8 31577 8.078 1.010
26 9 1 9 28208 9.043 1.005
26 10 1 10 25375 10.052 1.005
Due to Kp+Kt Due to Kt
N Kp Kt K T (msec) Spdup Effi. Spdup Effi.
-----------------------------------------------------
26 1 4 4 62952 4.052 1.013 4.007 1.002
26 2 4 8 31650 8.059 1.007 3.980 0.995
26 3 4 12 21502 11.863 0.989 3.873 0.968
26 4 4 16 16157 15.788 0.987 3.906 0.977
26 5 4 20 13168 19.371 0.969 3.834 0.959
26 6 4 24 10956 23.282 0.970 3.853 0.963
26 7 4 28 9480 26.907 0.961 3.810 0.952
26 8 4 32 8335 30.604 0.956 3.788 0.947
26 9 4 36 7320 34.847 0.968 3.854 0.963
26 10 4 40 6777 37.639 0.941 3.744 0.936
- Example: Generate primes up to 2N for N = 32
- 32 minutes on all 40 CPUs of the tardis cluster
- A whole day (21.4 hours) if done sequentially
java -Dpj.np=10 -Dpj.nt=4 edu.rit.hyb.prime.Prime32Hyb primes.dat 32
Job 76, dr04, dr08, dr05, dr06, dr07, dr01, dr02, dr09, dr00, dr03
1921092 msec 8
1921094 msec 9
1921145 msec 5
1921387 msec 6
1921400 msec 2
1921481 msec 4
1921500 msec 7
1921516 msec 3
1921508 msec 1
203280221 primes
1921536 msec 0
|
Parallel Computing II
|
|
•
|
|
4003-532-70/4005-736-70
|
|
•
|
|
Spring Quarter 2007
|
|
Course Page
|
|
Alan Kaminsky
|
|
•
|
|
Department of Computer Science
|
|
•
|
|
Rochester Institute of Technology
|
|
•
|
|
4486 +
2220 =
6706
|
|
Home Page
|
Copyright © 2007 Alan Kaminsky.
All rights reserved.
Last updated 09-Apr-2007.
Please send comments to ark@cs.rit.edu.
|