Big Data Processing

Analyze big data sets in parallel using distributed arrays, tall arrays, datastores, or mapreduce, on Spark® and Hadoop® clusters

You can use Parallel Computing Toolbox™ to distribute large arrays across multiple MATLAB® workers, so that you can run big-data applications that use the combined memory of your cluster. Parallel Computing Toolbox also enables you to execute MATLAB tall array and datastore calculations in parallel, so that you can analyze big data sets that do not fit in the memory of your cluster. You can use MATLAB Distributed Computing Server™ to run tall array and datastore calculations in parallel on Spark-enabled Hadoop clusters. Doing so can significantly reduce the execution time of very large data calculations.

  • Distributed Arrays
    Analyze big data sets in parallel using distributed arrays and simultaneous execution.
  • Tall Arrays and Mapreduce
    Analyze big data sets in parallel using MATLAB tall arrays and datastores or mapreduce on Spark and Hadoop clusters, and parallel pools.