isVocabularyWord

Test if word is member of word embedding or encoding

Description

tf = isVocabularyWord(emb,words) tests if the elements of words are members of the word embedding emb. The function returns a logical array containing 1 (true) where the words are members of the word embedding. Elsewhere, the array contains 0 (false). The function, by default, is case sensitive.

tf = isVocabularyWord(enc,words) tests if the elements of words are members of the word encoding enc. The function, by default, is case sensitive.

tf = isVocabularyWord(___,'IgnoreCase',true) tests if the specified words are in the vocabulary, ignoring case, using any of the previous syntaxes.
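The word encoding syntax and the 'IgnoreCase' option can be sketched together as follows. This is a minimal illustration, assuming Text Analytics Toolbox is installed; the sample sentences and the variable names tfDefault and tfIgnoreCase are made up for this sketch and are not part of the function's interface.

```matlab
% Build a small word encoding from made-up sample text (illustrative data only).
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);
enc = wordEncoding(documents);

% Words to test: one exact match, one that differs only in case, one absent.
words = ["example" "Sentence" "missing"];

% Default behavior is case sensitive, so "Sentence" should not match "sentence".
tfDefault = isVocabularyWord(enc,words);

% With 'IgnoreCase' set to true, "Sentence" should match "sentence".
tfIgnoreCase = isVocabularyWord(enc,words,'IgnoreCase',true);
```

Comparing tfDefault with tfIgnoreCase shows which words are missed only because of capitalization.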

Examples

Test to determine if words are members of a word embedding.

Load a pretrained word embedding using the fastTextWordEmbedding function. This function requires the Text Analytics Toolbox™ Model for fastText English 16 Billion Token Word Embedding support package. If this support package is not installed, then the function provides a download link.

emb = fastTextWordEmbedding
emb = 
  wordEmbedding with properties:

     Dimension: 300
    Vocabulary: [1×999994 string]

Test if the words "I", "love", and "fastTextWordEmbedding" are in the word embedding.

words = ["I" "love" "fastTextWordEmbedding"];
tf = isVocabularyWord(emb,words)

tf = 1×3 logical array

   1   1   0
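A common use of this result is to drop out-of-vocabulary words before mapping words to vectors, since word2vec is documented to return rows of NaN values for words it does not know. The following is a minimal sketch of that pattern, assuming the fastText support package from this example is installed; the variable names knownWords and vectors are illustrative.

```matlab
% Assumes the fastText support package used in this example is installed.
emb = fastTextWordEmbedding;
words = ["I" "love" "fastTextWordEmbedding"];

% Keep only the words that are in the embedding vocabulary.
tf = isVocabularyWord(emb,words);
knownWords = words(tf);

% Map the remaining words to embedding vectors, one row per word.
vectors = word2vec(emb,knownWords);
```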

Input Arguments

emb: Input word embedding, specified as a wordEmbedding object.

enc: Input word encoding, specified as a wordEncoding object.

words: Input words, specified as a string vector, character vector, or cell array of character vectors. If you specify words as a character vector, then the function treats the argument as a single word.

Data Types: string | char | cell
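The three accepted forms of the words argument can be compared side by side. This is a minimal sketch, assuming Text Analytics Toolbox is installed; the sample sentence and the variable names tfString, tfChar, and tfCell are made up for illustration.

```matlab
% Build a small word encoding from a made-up sentence (illustrative data only).
documents = tokenizedDocument("a short sample sentence");
enc = wordEncoding(documents);

% String vector: one logical value per element.
tfString = isVocabularyWord(enc,["short" "absent"]);

% Character vector: treated as a single word, so the result is one logical value.
tfChar = isVocabularyWord(enc,'short');

% Cell array of character vectors: one logical value per cell.
tfCell = isVocabularyWord(enc,{'short','absent'});
```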

Introduced in R2018b