datasample
Randomly sample from data, with or without replacement
Syntax
y = datasample(data,k)
y = datasample(data,k,dim)
[y,idx] = datasample(data,k,...)
[y,...] = datasample(s,data,k,...)
[y,...] = datasample(data,k,Name,Value)
[y,...] = datasample(data,k,dim,Name,Value)
Description
y = datasample(data,k) returns k observations sampled uniformly at random, with replacement, from the data in data.
y = datasample(data,k,dim) returns a sample taken along dimension dim of data.
[y,idx] = datasample(data,k,...) returns an index vector indicating which values datasample sampled from data.
[y,...] = datasample(s,data,k,...) uses the random number stream s to generate random numbers.
[y,...] = datasample(data,k,Name,Value) or [y,...] = datasample(data,k,dim,Name,Value) samples with additional options specified by one or more Name,Value pair arguments.
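Input Arguments
data: Vector, matrix, N-dimensional array, table, or dataset array representing the data from which to sample. By default, datasample samples along the first nonsingleton dimension of data.
k: Positive integer, the number of samples.
dim: Integer specifying the dimension along which to sample. For example, if data is a matrix and dim is 2, y contains a selection of columns of data.
s: Random number stream. Create s using RandStream. Default: the global random number stream.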
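As an illustration of the s argument, the following is a minimal sketch of reproducible sampling with a dedicated random number stream. The generator type, the seed value 42, and the variable names are arbitrary choices made for this example, not part of the original page.

```matlab
% Sketch: reproducible sampling with private random number streams.
% The seed (42) and the 'mt19937ar' generator are illustrative assumptions.
s1 = RandStream('mt19937ar','Seed',42);
s2 = RandStream('mt19937ar','Seed',42);   % second stream with the same seed

data = (1:20)';                 % column vector of 20 observations

y1 = datasample(s1,data,5);     % five observations drawn using stream s1
y2 = datasample(s2,data,5);     % same call using the identically seeded s2

same = isequal(y1,y2);          % identical streams give identical samples
```

Passing a stream this way leaves the global random number stream untouched, which can be convenient when other code depends on its state.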
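Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'Replace': Select the sample with replacement if 'Replace' is true, or without replacement if 'Replace' is false. Default: true
'Weights': Vector of sampling weights with the same number of elements as there are observations in data along the sampled dimension. Default: equal weights (uniform sampling)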
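The two options can be combined. The sketch below draws a weighted sample without replacement and also requests the index output. The data values and weights are made up for illustration.

```matlab
% Sketch: weighted sampling without replacement.
% The data values and weights below are illustrative assumptions.
data = [10 20 30 40 50];
w    = [0.4 0.3 0.15 0.1 0.05];   % one positive weight per element of data

% Draw 3 distinct elements; elements with larger weights are favored.
[y,idx] = datasample(data,3,'Replace',false,'Weights',w);

% Without replacement, no index appears more than once in idx.
noRepeats = numel(unique(idx)) == numel(idx);
```

Because the sample is taken without replacement, k (here 3) cannot exceed the number of elements available along the sampled dimension.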
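Output Arguments
y: When the sample is taken with replacement (default), y can contain repeated observations from data.
idx: Index vector indicating which elements datasample chose from data to create y. For example, if data is a vector, y = data(idx).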
Examples
Draw five unique values from the integers 1:10.
y = datasample(1:10,5,'Replace',false)
y =
     6     3     7     8     5
Generate a random sequence of the characters ACGT, with replacement, according to specified probabilities.
seq = datasample('ACGT',48,'Weights',[0.15 0.35 0.35 0.15])
seq =
CTTCGACTGTGAGTGGGCGCGACAAGGCTACCGGCCCGGGCGGCACTC
Select a random subset of columns from a data matrix.
X = randn(10,1000);
Y = datasample(X,5,2,'Replace',false)
Y =
    0.7007    0.3382    2.1298   -0.1891    0.5026
    0.6520   -0.6693   -0.1961   -0.9915    1.9107
    0.1785    0.6640    2.3247   -1.1735   -1.0020
    1.6760    2.6102   -0.8902   -0.7735    1.8676
   -0.3251   -0.6415   -0.2572   -0.1629   -1.0523
    0.1011    0.9323   -1.3088   -0.4477    0.8036
   -0.5767   -0.5778   -0.8556    0.8672   -0.0727
   -0.0615   -0.9084    0.9020   -0.4185   -1.9520
    0.7256   -1.1228    0.7558    1.2691    2.4997
   -1.2273    0.5754   -0.8755   -0.8224   -1.2066
Resample observations from a dataset array to create a bootstrap replicate dataset.
load hospital
y = datasample(hospital,size(hospital,1));
Use the second output to sample "in parallel" from two data vectors.
x1 = randn(100,1);
x2 = randn(100,1);
[y1,idx] = datasample(x1,10);
y2 = x2(idx);
Alternatives
You can use randi or randperm to generate indices for random sampling with or without replacement, respectively. However, datasample can be more convenient because it samples directly from your data. datasample also allows weighted sampling.
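For comparison, the index-based alternatives can be sketched as follows. The data vector and the sample size k are illustrative assumptions. Note that neither approach supports weights.

```matlab
% Sketch: index-based sampling with randi and randperm.
% The data vector and sample size are illustrative assumptions.
data = randn(100,1);
k = 10;
n = numel(data);

% With replacement: k indices drawn uniformly from 1..n, repeats possible.
idxWith = randi(n,k,1);
yWith = data(idxWith);

% Without replacement: k distinct indices drawn from 1..n.
idxWithout = randperm(n,k);
yWithout = data(idxWithout);
```

With datasample, the indexing step is handled internally, and the same indices are still available through the second output if they are needed.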
References
[1] Wong, C. K. and M. C. Easton. An Efficient Method for Weighted Sampling Without Replacement. SIAM Journal on Computing 9(1), pp. 111–113, 1980.