Upload Deep Learning Data to the Cloud

This example shows how to upload data to an Amazon S3 bucket.

Before you can perform deep learning training in the cloud, you need to upload your data to the cloud. The example shows how to download the CIFAR-10 data set to your computer, and then upload the data to an Amazon S3 bucket for later use in MATLAB. The CIFAR-10 data set is a labeled image data set commonly used for benchmarking image classification algorithms. Before running this example, you need access to an Amazon Web Services (AWS) account. After you upload the data set to Amazon S3, you can try any of the examples in Deep Learning in Parallel and in the Cloud.

Download CIFAR-10 to Local Machine

Specify a local directory in which to download the data set. The following code creates a folder in your current directory containing all the images in the data set.

directory = pwd;
[trainDirectory,testDirectory] = downloadCIFARToFolders(directory);
Downloading CIFAR-10 data set...done.
Copying CIFAR-10 to folders...done.

Upload Local Data Set to Amazon S3 Bucket

To work with data in the cloud, you can upload to Amazon S3 and then use datastores to access the data in S3 from the workers in your cluster. The following steps describe how to upload the CIFAR-10 data set from your local machine to an Amazon S3 bucket.

1. For efficient file transfers to and from Amazon S3, download and install the AWS Command Line Interface tool from https://aws.amazon.com/cli/.

2. Specify your AWS Access Key ID, Secret Access Key, and Region of the bucket as system environment variables. Contact your AWS account administrator to obtain your keys.

For example, on Linux, macOS, or UNIX, specify these variables:

export AWS_ACCESS_KEY_ID="YOUR_AWS_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
export AWS_DEFAULT_REGION="us-east-1"

On Windows, specify these variables:

set AWS_ACCESS_KEY_ID="YOUR_AWS_ACCESS_KEY_ID"
set AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
set AWS_DEFAULT_REGION="us-east-1"

To specify these environment variables permanently, set them in your user or system environment.
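Before moving on, it can help to confirm that the variables are actually visible to child processes such as the AWS CLI. The following is a small sketch for Linux/macOS bash shells (not part of the original workflow; the loop and variable names are only a convenience check):

```shell
# Sketch: confirm the AWS credential variables are set before running the CLI.
# Prints each variable name with its value, or "(unset)" if it is missing.
for v in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION; do
  printf '%s=%s\n' "$v" "${!v:-(unset)}"
done
```

If any line prints "(unset)", the AWS CLI commands in the following steps will fail to authenticate.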

3. Create a bucket for your data by using either the AWS S3 web page or a command such as the following:

aws s3 mb s3://mynewbucket

4. Upload your data using a command such as the following:

aws s3 cp mylocaldatapath s3://mynewbucket --recursive

For example:

aws s3 cp path/to/CIFAR10/in/the/local/machine s3://MyExampleCloudData/cifar10/ --recursive

5. Provide your AWS credentials to the workers in your cluster by completing the following steps in MATLAB:

a. In the Environment section on the Home tab, select Parallel > Create and Manage Clusters.

b. In the Cluster Profile pane of the Cluster Profile Manager, select your cloud cluster profile.

c. In the Properties tab, select the EnvironmentVariables property, scrolling as necessary to find the property.

d. At the bottom right of the window, click Edit.

e. Click in the box to the right of EnvironmentVariables, and then type these three variables, each on its own line: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION.

f. At the bottom right of the window, click Done.

For information on how to create a cloud cluster, see Create Cloud Cluster (Parallel Computing Toolbox).
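As an alternative to the Cluster Profile Manager steps above, the same environment variables can be set on a cluster profile programmatically. The following is a sketch only; 'MyCloudCluster' is a hypothetical profile name, and it assumes your cluster object supports the EnvironmentVariables property (Parallel Computing Toolbox):

```matlab
% Sketch: copy the AWS credential variables to the cluster workers
% programmatically instead of through the Cluster Profile Manager.
% 'MyCloudCluster' is a placeholder; replace it with your profile name.
c = parcluster('MyCloudCluster');
c.EnvironmentVariables = {'AWS_ACCESS_KEY_ID', ...
                          'AWS_SECRET_ACCESS_KEY', ...
                          'AWS_DEFAULT_REGION'};
saveProfile(c);   % persist the change to the cluster profile
```

Listing a variable name here copies its current value from your client session to each worker when a parallel pool or job starts.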

Use Data Set in MATLAB

After you store your data in Amazon S3, you can use datastores to access the data from your cluster workers. Simply create a datastore pointing to the URL of the S3 bucket. The following sample code shows how to use an imageDatastore to access an S3 bucket. Replace 's3://MyExampleCloudData/cifar10/train' with the URL of your S3 bucket.

imds = imageDatastore('s3://MyExampleCloudData/cifar10/train', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
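Once created, the S3-backed datastore behaves like a local imageDatastore. The following sketch (not part of the original example; it assumes the folder-per-label layout produced by the upload steps above) inspects the labels and reads one image:

```matlab
% Sketch: inspect and read from the S3-backed datastore.
labelCounts = countEachLabel(imds);   % table of labels and image counts
[img,info] = read(imds);              % read one image and its metadata
imshow(img)
title(string(info.Label))             % show the image with its label
```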

With the CIFAR-10 data set now stored in Amazon S3, you can try any of the examples in Deep Learning in Parallel and in the Cloud that show how to use CIFAR-10 in different use cases.

See Also

Related Topics