Datastore

Read large collections of data

The datastore function creates a datastore, which is a repository for collections of data that are too large to fit in memory. A datastore allows you to read and process data stored in multiple files on a disk, a remote location, or a database as a single entity. If the data is too large to fit in memory, you can manage the incremental import of data, create a tall array to work with the data, or use the datastore as an input to mapreduce for further processing. For more information, see Getting Started with Datastore.
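The incremental import described above can be sketched as a read loop. This is a minimal illustration, not a recommended workflow: it assumes a small in-memory column vector (the hypothetical variable `data`) as a stand-in for a large file collection, and wraps it in an arrayDatastore so the example runs without any files on disk.

```matlab
% Hypothetical stand-in for a large data collection: ten values in memory.
data = (1:10)';

% Read up to 4 rows per call; "same" returns numeric data rather than cells.
ds = arrayDatastore(data, "ReadSize", 4, "OutputType", "same");

% Process the data one chunk at a time instead of loading it all at once.
total = 0;
while hasdata(ds)
    chunk = read(ds);            % next block of rows
    total = total + sum(chunk);  % accumulate a running result
end

% Return the datastore to its initial state so it can be read again.
reset(ds);
```

With a file-based datastore such as tabularTextDatastore, the same hasdata/read/reset pattern would apply, with each read returning a block of rows from the underlying files.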

Functions

datastore Create datastore for large collections of data
tabularTextDatastore Datastore for tabular text files
spreadsheetDatastore Datastore for spreadsheet files
imageDatastore Datastore for image data
parquetDatastore Datastore for collection of Parquet files
fileDatastore Datastore with custom file reader
arrayDatastore Datastore for in-memory data
read Read data in datastore
readall Read all data in datastore
preview Preview subset of data in datastore
hasdata Determine if data is available to read
reset Reset datastore to initial state
writeall Write datastore to files
shuffle Shuffle all data in datastore
isShuffleable Determine whether datastore is shuffleable
numpartitions Number of datastore partitions
partition Partition a datastore
isPartitionable Determine whether datastore is partitionable

Functions

combine Combine data from multiple datastores
transform Transform datastore
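As a rough sketch of how these two functions might be used, the example below transforms one datastore and combines two others. The variable names and the tiny in-memory arrays are hypothetical and chosen only so the example is self-contained; it assumes numeric arrayDatastore inputs with "OutputType" set to "same".

```matlab
% Two hypothetical in-memory datastores with the same number of rows.
ds1 = arrayDatastore([1; 2; 3], "OutputType", "same");
ds2 = arrayDatastore([10; 20; 30], "OutputType", "same");

% transform applies a function to the data returned by each read of ds1.
tds = transform(ds1, @(x) x.^2);
squared = readall(tds);

% combine reads from both datastores and horizontally concatenates the data.
cds = combine(ds1, ds2);
paired = readall(cds);
```

Neither call reads data immediately; the transformation and the concatenation happen when the resulting datastore is read.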

Objects

CombinedDatastore Datastore to combine data read from multiple underlying datastores
TransformedDatastore Datastore to transform underlying datastore
KeyValueDatastore Datastore for key-value pair data for use with mapreduce
TallDatastore Datastore for checkpointing tall arrays

Classes

matlab.io.Datastore Base datastore class
matlab.io.datastore.Partitionable Add parallelization support to datastore
matlab.io.datastore.HadoopLocationBased Add Hadoop support to datastore
matlab.io.datastore.Shuffleable Add shuffling support to datastore
matlab.io.datastore.DsFileSet File-set object for collection of files in datastore
matlab.io.datastore.DsFileReader File-reader object for files in a datastore
matlab.io.datastore.FileWritable Add file writing support to datastore
matlab.io.datastore.FoldersPropertyProvider Add Folders property support to datastore
matlab.io.datastore.FileSet File-set for collection of files in datastore
matlab.io.datastore.BlockedFileSet Blocked file-set for collection of blocks within file

Topics

Getting Started with Datastore

A datastore is an object for reading a single file or a collection of files or data.

Select Datastore for File Format or Application

Choose the right datastore based on the file format of your data or application.

Read and Analyze Large Tabular Text File

This example shows how to create a datastore for a large text file containing tabular data, and then read and process the data one block at a time or one file at a time.

Read and Analyze Image Files

This example shows how to create a datastore for a collection of images, read the image files, and find the images with the maximum average hue, saturation, and brightness (HSV).

Read and Analyze MAT-File with Key-Value Data

This example shows how to create a datastore for key-value pair data in a MAT-file that is the output of mapreduce.

Read and Analyze Hadoop Sequence File

This example shows how to create a datastore for a Sequence file containing key-value data.

Work with Remote Data

Work with remote data in Amazon S3™, Microsoft® Azure® Storage Blob, or HDFS™.

Set Up Datastore for Processing on Different Machines or Clusters

Set up a datastore on your machine that can be loaded and processed on another machine or cluster.

Develop Custom Datastore

Create a fully customized datastore for your custom or proprietary data.

Develop Custom Datastore for DICOM Data

This example shows how to develop a custom datastore that supports writing operations.

Testing Guidelines for Custom Datastores

After implementing your custom datastore, follow this test procedure to qualify your custom datastore.