doclength

Length of documents in document array

Description

N = doclength(documents) returns the number of tokens in each document in documents.

Examples

Find the number of words in an array of tokenized documents. Erase the punctuation characters so they do not get counted as words.

str = [...
    "An example of a short sentence."
    "A second short sentence."];
documents = tokenizedDocument(str)

documents = 
  2x1 tokenizedDocument:

    7 tokens: An example of a short sentence .
    5 tokens: A second short sentence .

documents = erasePunctuation(documents)

documents = 
  2x1 tokenizedDocument:

    6 tokens: An example of a short sentence
    4 tokens: A second short sentence

N = doclength(documents)

N = 2×1

     6
     4
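
Because N is an ordinary numeric vector, it can be passed to standard MATLAB functions. The following is a minimal sketch, not part of the original example, that summarizes the lengths and picks out the longest document. The variable names avgLength, maxLength, idx, and longest are illustrative.

```matlab
% Rebuild the documents from the example above (requires Text Analytics Toolbox).
str = [
    "An example of a short sentence."
    "A second short sentence."];
documents = erasePunctuation(tokenizedDocument(str));

N = doclength(documents);      % 6 and 4 tokens, per the output above

avgLength = mean(N);           % average number of tokens per document
[maxLength, idx] = max(N);     % length and position of the longest document
longest = documents(idx);      % the longest document itself
```

With the lengths shown above, avgLength is 5 and the longest document is the first one.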

Input Arguments

Input documents, specified as a tokenizedDocument array.

Output Arguments

Document lengths, returned as a vector of nonnegative integers. The size of N is the same as the size of documents.
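
The size relationship described above can be checked directly. This is a small sketch under the assumption that the input is the 2-by-1 array from the example. The variable name sameSize is illustrative.

```matlab
% Column array of two tokenized documents (requires Text Analytics Toolbox).
str = [
    "An example of a short sentence."
    "A second short sentence."];
documents = tokenizedDocument(str);

N = doclength(documents);

% Compare the dimensions of the output with those of the input.
sameSize = isequal(size(N), size(documents));
```

Here sameSize should be true, since N holds one length per document.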

Introduced in R2017b