In a
recent post
, I talked about
for
-loops in MATLAB and how to optimize their use knowing how MATLAB stores arrays in memory. Today I want to talk about getting ready for parallel computation, specifically using parallel
for
-loops, via
parfor
.
En route to creating code suitable for running in parallel, sometimes we take code with a
for
-loop and simply replace it with a parallel loop, using
parfor
. That is, if we can't vectorize the code well first. This transformation from
for->parfor
works really well sometimes, but does not always work, and for very good reasons.
For example, you can't simply replace
for
with
parfor
if loop iterations are not completely independent. You can find out more conditions
here
. There is a notable exception to this rule,
reduction variables
, (a
reduction variable
accumulates a value that depends on all the iterations together, but is independent of the iteration order). The documentation for this is great. Here's a quick pseudo-code that would parallelize just fine, even though we reuse the variable
s
在right and left-hand sides. As long as the result is not numerically sensitive to the order in which the accumulation occurs, you will get the "right" answer here whether you use a
for
or a
parfor
loop.
Testing for Parallelizability
I doubt parallelizability is word, but I think you know what I mean!
One simple way to see if your loop might be suitable for parallelization is to try running the loop in another order to see if you get the same result. If so, it is a candidate for
parfor
. The easiest version of this might be running the loop backwards. If you get the same results as the standard sweep, you should be good to go. Instead of backwards, you can also try replacing the loop with the counter in random order; this can be achieved easily using
randperm
.
Note: clearly this calculation can be vectorized easily. That's not my focus here.
qforward = zeros(size(k));
qforward(ind) = k(ind) - meank;
qbackward = zeros(size(k));
qbackward(印第安纳州)= k(在d) - meank;
qrandom = zeros(size(k));
qrandom(ind) = k(ind) - meank;
aretheyequal = isequal(qforward, qbackward, qrandom)
In case the answers aren't exactly equal, you could compare them in a way similar to this.
nearlyfb = norm(qforward-qbackward) < tol
nearlyfr = norm(qforward-qrandom) < tol
nearlybr = norm(qbackward-qrandom) < tol
So this calculation is ready to try with
parfor
.
qpar(ind) = k(ind) - meank;
aretheyequal = isequal(qpar, qforward)
What if I Have an Expression with Iteration Dependency
You may not always be able to parallelize your code. However, there are some conditions under which you can convert your code to take advantage of parallelization by refactoring your calculations a little Let me show you an example. In this case, I will do some of the accumulation after the loop is complete.
Here's the
for
loop equivalent, no warning.
x(ind) = x(ind-1) + k(ind)^2;
And here's how you might parallelize it
aretheyequal = isequal(x,xx)
Thoughts?
Do you often work with large datasets or simulations where getting access to hardware suitable for running code in parallel helps your work scale? Did your code fall into any of the simple categories I mentioned in this post? I would love to hear from you about any issues you have wrangling your code into a form suitable for parallelization. Post your thoughts
here
.
コメント
コメントを残すには、ここをクリックして MathWorks アカウントにサインインするか新しい MathWorks アカウントを作成します。