- NQueens coarse-grained -N-body force computation Discussed in PCA book Some code here: http://www.cs.princeton.edu/courses/archive/spr01/cs126/assignments/nbody.html - Matrix multiplication using Strassen's formulas Description: http://mathworld.wolfram.com/StrassenFormulas.html Serial: http://www.cs.duke.edu/~alvy/papers/sc98/ Parallel: http://www.cs.utexas.edu/users/rvdg/abstracts/SSUMMA.PPL.html Medium- to coarse-grained - Solving linear equations (use MM + Matrix inversion to solve Ax=B) Medium- to coarse-grained - FFT (*) / DFT http://www.fftw.org - extremely fast serial version http://astro.uchicago.edu/Computing/On_Line/fft/fft.html - parallel version Medium grained to fine grained for FFT - Training/evaluating an evolutionary algorithm 3-D tic-tac-toe code from Chris Conger Can adapt to other games, such as backgammon/checkers (*) http://satirist.org/learn-game/systems/gammon/ Coarse-grained for training, fine-grained for neural network evaluation - Parallel sort Compare with BUPC implementation Use Divide + Conquer techniques (may map better to message-passing paradigm) Medium- to fine-grained (depending on sort algorithm and parallelization technique) - Discrete-event simulation (*) Difficult problem, some related work available - http://citeseer.ist.psu.edu/vee99parallel.html Fine-grained - LaPAcK/ScaLAPACK (*) ScaLAPACK uses MPI, but f77/f90 compiler Can use f2c, but makes messy C code Wide range of parallelization - Quadratice sieve (code from Conger) Medium-grained parallelization - Linear regression/curve fitting linear regression easy case fitting to quadratic equation harder Medium-grained - Parallel Ogg Vorbis audio encoding (*) use a library; split an audio file into large parts and encode each part independently; recombine into one file could be really trivial or really hard to implement (haven't looked at library code) coarse-grained, heavy on communication - Bioinformatics: MPI-BLAST (*) convert code from http://mpiblast.lanl.gov/ 4k lines of C++, probably would take too long to convert