File Exchange Pick of the Week

Our best user submissions

Create Persistent Resources on Parallel Workers

Sean's pick this week is WorkerObjWrapper by MathWorks' Parallel Computing Team.

Background

MATLAB's Parallel Computing Toolbox gives you the ability to open a pool of MATLAB workers that you can distribute work to with high-level commands like parfor.

Communicating with the workers in this pool always incurs data-transfer overhead. The less data we have to transmit to the workers, the better the speedup we'll see. With large arrays this overhead can be hard to avoid, and it can actually make a parallel computation slower than the serial one. WorkerObjWrapper provides a convenient way to make data persist on a worker; this could be large arrays, database connections, or anything else we need on each iteration of a parfor loop.
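As a rough illustration of the idea, here is a minimal sketch that builds an array once on each worker and then reuses it inside parfor. It assumes WorkerObjWrapper from the File Exchange is on the path and that it is constructed from a function handle plus a cell array of arguments, with the result exposed through its Value property. The size n and the use of magic are illustrative choices, not something from the submission.

```matlab
% Sketch only: assumes WorkerObjWrapper (File Exchange) is on the path.
n = 200;                              % illustrative size (assumption)

% Each worker evaluates magic(n) once and keeps the result.
w = WorkerObjWrapper(@magic, {n});

colSums = zeros(1, n);
parfor k = 1:n
    A = w.Value;                      % worker-local array, built once per worker
    colSums(k) = sum(A(:, k));        % use it on every iteration
end
```

Only the small wrapper object travels to the workers; the array itself is created where it is used.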

Let's See it In Action

We're going to pull some financial data from Yahoo! using the connection from the Datafeed Toolbox.

I have a list of securities and the corresponding fields I want from them:

    % Securities and desired fields
    securities = {'MAR','PG','MSFT','SAM',...
        'TSLA','YHOO','CMG','AAL'};
    fields = {'High',{'low','High'},'High','High',...
        {'Low','high'},'Low',{'low','Volume'},'Low'};

I first want to make sure there is an open parallel pool (parpool) to distribute computations to. I have a two-core laptop, so I'll open two local workers by selecting the icon at the bottom-left corner of the desktop.
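The pool can also be opened from code rather than from the desktop icon. The following is a small sketch using gcp and parpool; the worker count of 2 simply mirrors the two-core laptop mentioned above and should be adjusted for your machine.

```matlab
% Reuse the current pool if one exists; otherwise start a new one.
pool = gcp('nocreate');               % returns empty if no pool is open
if isempty(pool)
    pool = parpool(2);                % 2 workers is an assumption for a two-core machine
end
fprintf('Pool has %d workers\n', pool.NumWorkers);
```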

I've written three equivalent functions to pull the prices from Yahoo!

  • fetchFOR - uses a regular for-loop to fetch the prices
  • fetchPARFOR - uses a parallel for-loop
  • fetchWOWPARFOR - uses a parallel for-loop and WorkerObjWrapper to make the connection on all workers.
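The source of these three functions is not shown here, so the following is only a hypothetical reconstruction of what the third one might look like. It assumes the yahoo, fetch, and close functions from the R2013b-era Datafeed Toolbox, and that WorkerObjWrapper accepts an optional cleanup function as its third argument. The variable names and the cell-array return type are guesses.

```matlab
function prices = fetchWOWPARFOR(securities, fields)
% Hypothetical sketch; not the author's actual implementation.
% Assumes Datafeed Toolbox (yahoo, fetch, close) and WorkerObjWrapper.

% Create one Yahoo! connection per worker; close it when w is cleared.
w = WorkerObjWrapper(@yahoo, {}, @close);

n = numel(securities);
prices = cell(n, 1);                  % return type is an assumption
parfor k = 1:n
    % w.Value is the connection that already lives on this worker.
    prices{k} = fetch(w.Value, securities{k}, fields{k});
end
end
```

A plain parfor version would instead have to create the connection inside the loop body, paying the connection cost on every iteration rather than once per worker.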

First, a sanity check to make sure they all do the same thing:

    ff = fetchFOR(securities,fields);
    fp = fetchPARFOR(securities,fields);
    fw = fetchWOWPARFOR(securities,fields);
    assert(isequal(ff,fp,fw)); % Errors if they're not equal

Since the assertion passed, meaning the functions return the same result, we can now do the timings. I'll use timeit.

    t = zeros(3,1);
    % Measure timings
    t(1) = timeit(@()fetchFOR(securities,fields),1);
    t(2) = timeit(@()fetchPARFOR(securities,fields),1);
    t(3) = timeit(@()fetchWOWPARFOR(securities,fields),1);
    % Show results
    fprintf('%.3fs %s\n',t(1),'for',t(2),'parfor',t(3),'parfor with WorkerObjWrapper')

    8.631s for
    5.991s parfor
    4.255s parfor with WorkerObjWrapper

So we can see that creating the connection once on each worker in the parallel pool and then using parfor gives us the best computation time: 4.255 s, versus 5.991 s for plain parfor and 8.631 s for the regular for-loop.

Comments

Do you have to work with large data or repeat a process multiple times where parallel computing might help? I'm curious to hear your experiences and the challenges that you've faced.

Give it a try and let us know what you think here or leave a comment for our Parallel Computing Team.




Published with MATLAB® R2013b
