standardizeMissing

Insert standard missing values

Description

B = standardizeMissing(A,indicator) replaces values specified in indicator with standard missing values in A and returns a standardized array or table.

Missing values are defined according to the data type of A:

  • NaN: double, single, duration, and calendarDuration

  • NaT: datetime

  • <missing>: string

  • <undefined>: categorical

  • ' ': char

  • {''}: cell of character vectors

If A is a table, then the data type of each column defines the missing value for that column.
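
As a rough illustration of the list above, the following sketch standardizes one sentinel value in arrays of several types and checks the result with ismissing. The sentinel values (-1, "N/A", the year-1900 date, and 'unknown') are invented for this example and are not part of the reference page.

```matlab
% Hypothetical sentinel values chosen only for illustration.
n = standardizeMissing([1 -1 3], -1);                 % double: -1 becomes NaN
s = standardizeMissing(["a" "N/A" "c"], "N/A");       % string: "N/A" becomes <missing>

sentinel = datetime(1900,1,1);
d = standardizeMissing([datetime(2020,1,1) sentinel], sentinel);    % datetime: sentinel becomes NaT

c = standardizeMissing(categorical({'red','unknown'}), 'unknown');  % categorical: becomes <undefined>

% ismissing flags the standardized entry regardless of data type
disp(ismissing(n))
disp(ismissing(s))
disp(ismissing(d))
disp(ismissing(c))
```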

B = standardizeMissing(___,Name,Value) specifies additional parameters for standardizing missing values using one or more name-value arguments. For example, standardizeMissing(A,indicator,'DataVariables',datavars) standardizes missing values in the variables specified by datavars when A is a table or timetable.

Examples

Create a row vector and replace all instances of -99 with the standard missing value for double data types, NaN.

    A = [0 1 5 -99 8 3 4 -99 16];
    B = standardizeMissing(A,-99)

    B = 1×9

         0     1     5   NaN     8     3     4   NaN    16

Create a table containing Inf and 'N/A' to represent missing values.

    dblVar = [NaN;3;Inf;7;9];
    cellstrVar = {'one';'three';'';'N/A';'nine'};
    charVar = ['A';'C';'E';' ';'I'];
    categoryVar = categorical({'red';'yellow';'blue';'violet';''});
    A = table(dblVar,cellstrVar,charVar,categoryVar)

    A=5×4 table
        dblVar    cellstrVar    charVar    categoryVar
        ______    __________    _______    ___________

         NaN      {'one'   }       A       red
           3      {'three' }       C       yellow
         Inf      {0x0 char}       E       blue
           7      {'N/A'   }               violet
           9      {'nine'  }       I       <undefined>

Replace all instances of Inf with NaN and replace all instances of 'N/A' with the empty character vector, ''.

    B = standardizeMissing(A,{Inf,'N/A'})

    B=5×4 table
        dblVar    cellstrVar    charVar    categoryVar
        ______    __________    _______    ___________

         NaN      {'one'   }       A       red
           3      {'three' }       C       yellow
         NaN      {0x0 char}       E       blue
           7      {0x0 char}               violet
           9      {'nine'  }       I       <undefined>

Replace instances of Inf and 'N/A' occurring in specified variables of a table with the standard missing value indicators.

Create a table containing Inf and 'N/A' to represent missing values.

    a = {'alpha';'bravo';'charlie';'';'N/A'};
    x = [1;NaN;3;Inf;5];
    y = [57;732;93;1398;Inf];
    A = table(a,x,y)

    A=5×3 table
             a          x      y
        ___________    ___    ____

        {'alpha'  }      1      57
        {'bravo'  }    NaN     732
        {'charlie'}      3      93
        {0x0 char }    Inf    1398
        {'N/A'    }      5     Inf

For the variables a and x, replace instances of Inf with NaN and 'N/A' with the empty character vector, ''.

    B = standardizeMissing(A,{Inf,'N/A'},'DataVariables',{'a','x'})

    B=5×3 table
             a          x      y
        ___________    ___    ____

        {'alpha'  }      1      57
        {'bravo'  }    NaN     732
        {'charlie'}      3      93
        {0x0 char }    NaN    1398
        {0x0 char }      5     Inf

Inf in the variable y remains unchanged because y is not included in the DataVariables name-value argument.

Input Arguments

A: Input data, specified as a vector, matrix, multidimensional array, table, or timetable. If A is a timetable, then standardizeMissing operates on the table data only and ignores NaT and NaN values in the vector of row times.

Data Types: double | single | char | string | cell | table | timetable | categorical | datetime | duration

indicator: Nonstandard missing value indicator, specified as a scalar, vector, or cell array. The elements of indicator define the values that standardizeMissing treats as missing. If A is an array, then indicator must be a vector. If A is a table or timetable, then indicator can also be a cell array with entries of multiple data types.

The data types specified in indicator match data types in the corresponding entries of A. The following are additional data type matches between the elements of indicator and elements of A:

  • double indicators match double, single, integer, and logical entries of A.

  • string and char indicators match categorical entries of A.

Example: B = standardizeMissing(A,'N/A') replaces the character vector 'N/A' with the empty character vector, ''.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char | string | cell | datetime | duration
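
The indicator rules above can be sketched as follows. This is a minimal example with made-up sentinel values (-99 and -999). It shows a vector indicator that lists several values, and a double indicator applied to single data.

```matlab
% A vector indicator lists several sentinel values at once (illustrative values).
A = [1 -99 3 -999 5];
B = standardizeMissing(A,[-99 -999]);   % both sentinels are replaced with NaN

% A double indicator also matches single data; the output keeps the input class.
S = single([1 -99 3]);
Bs = standardizeMissing(S,-99);

disp(B)
disp(class(Bs))
```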

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: standardizeMissing(T,indicator,'ReplaceValues',false)

DataVariables: Table variables to operate on, specified as one of the options in this table. The DataVariables value indicates which variables of the input table to standardize.

Other variables in the table not specified by DataVariables pass through to the output without being standardized.

Option                                Description                                         Examples

Variable name                         A character vector or scalar string specifying      'Var1'
                                      a single table variable name                        "Var1"

Vector of variable names              A cell array of character vectors or string array   {'Var1' 'Var2'}
                                      where each element is a table variable name         ["Var1" "Var2"]

Scalar or vector of variable indices  A scalar or vector of table variable indices        1
                                                                                          [1 3 5]

Logical vector                        A logical vector whose elements each correspond     [true false true]
                                      to a table variable, where true includes the
                                      corresponding variable and false excludes it

Function handle                       A function handle that takes a table variable as    @isnumeric
                                      input and returns a logical scalar

vartype subscript                     A table subscript generated by the vartype function vartype('numeric')

Example: standardizeMissing(T,indicator,'DataVariables',["Var1" "Var2" "Var4"])
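
As a sketch of how the selection options compare, the following example applies three of them to a small table. The table contents and the variable names name, x, and y are invented for illustration.

```matlab
% Illustrative table; names and values are assumptions made for this sketch.
name = {'alpha';'bravo';'charlie'};
x = [1;Inf;3];
y = [Inf;20;30];
T = table(name,x,y);

% Function handle: standardize Inf in every numeric variable (x and y)
B1 = standardizeMissing(T,Inf,'DataVariables',@isnumeric);

% vartype subscript: the same selection expressed with vartype
B2 = standardizeMissing(T,Inf,'DataVariables',vartype('numeric'));

% Variable index: standardize only the second variable (x); y passes through
B3 = standardizeMissing(T,Inf,'DataVariables',2);
```

The function-handle and vartype forms are convenient when the table layout is not known in advance, while names or indices give exact control over which variables change.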

ReplaceValues: Replace values indicator, specified as one of these values when A is a table or timetable:

  • true or 1: Replace input table variables with table variables containing standardized data.

  • false or 0: Append table variables containing standardized data to the input table variables.

For vector, matrix, or multidimensional array input data, ReplaceValues is not supported.

B is the same size as A unless the value of ReplaceValues is false. If the value of ReplaceValues is false, then the width of B is the sum of the input data width and the number of data variables specified.

Example: standardizeMissing(T,indicator,'ReplaceValues',false)
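
The size rule above can be checked with a short sketch. The table here is invented for illustration, and the example assumes a MATLAB release that supports the ReplaceValues argument. The name given to the appended variable is not stated in this section, so the sketch prints it rather than assuming it.

```matlab
% Illustrative table; names and values are assumptions made for this sketch.
x = [1;Inf;3];
y = [Inf;20;30];
T = table(x,y);

% Default behavior (ReplaceValues is true): x is standardized in place
B1 = standardizeMissing(T,Inf,'DataVariables','x');

% ReplaceValues false: the input variables are kept and a standardized copy of x is appended
B2 = standardizeMissing(T,Inf,'DataVariables','x','ReplaceValues',false);

disp(width(B1))                      % same width as T
disp(width(B2))                      % width of T plus one data variable
disp(B2.Properties.VariableNames)    % inspect the name of the appended variable
```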

Algorithms

standardizeMissing treats leading and trailing white space differently for cell arrays of character vectors, character arrays, and categorical arrays.

  • For cell arrays of character vectors, standardizeMissing does not ignore white space. All character vectors must match exactly a character vector specified in indicator.

  • For character arrays, standardizeMissing ignores trailing white space.

  • For categorical arrays, standardizeMissing ignores leading and trailing white space.
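
A minimal sketch of the white-space rules above, using invented data. The cell array case requires an exact match. The categorical case is written on the assumption that, per the rule above, an indicator padded with spaces still matches the category.

```matlab
% Cell array of character vectors: white space is significant
C = {'N/A';'N/A ';' N/A'};
Cstd = standardizeMissing(C,'N/A');    % only the exact match 'N/A' is replaced with ''

% Categorical array: leading and trailing white space is ignored,
% so the padded indicator ' N/A ' is expected to match the category N/A
G = categorical({'N/A';'ok'});
Gstd = standardizeMissing(G,' N/A ');

disp(Cstd)
disp(Gstd)
```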

Extended Capabilities

Version History

Introduced in R2013b
