newBag = removeInfrequentNgrams(bag,count) removes the n-grams that appear at most count times in total from the bag-of-n-grams model bag. The function, by default, is case sensitive.
newBag = removeInfrequentNgrams(bag,count,'NgramLengths',lengths) only removes n-grams with lengths specified by lengths. The function, by default, is case sensitive.
newBag = removeInfrequentNgrams(___,'IgnoreCase',true) removes the n-grams that appear at most count times, ignoring case. If n-grams differ only by case, then the corresponding counts are merged.
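The difference between the case-sensitive default and the 'IgnoreCase' option can be sketched on a small toy corpus. The three documents below are made up for illustration and are not part of the original example; the sketch assumes Text Analytics Toolbox is installed and relies on bagOfNgrams counting bigrams by default.

```matlab
% Toy corpus (illustrative only). "the quick" and "The quick" differ only by case.
documents = tokenizedDocument([
    "the quick brown fox"
    "The quick brown dog"
    "the lazy red fox"]);

% bagOfNgrams counts bigrams by default.
bag = bagOfNgrams(documents);

% Case-sensitive removal: every bigram that appears at most once is removed.
% In this corpus only "quick brown" appears twice, so it is the only survivor.
count = 1;
newBag = removeInfrequentNgrams(bag,count);

% Case-insensitive removal: the counts of "the quick" and "The quick" are
% merged (1 + 1 = 2), so that bigram now exceeds the threshold and is kept.
newBagIgnoreCase = removeInfrequentNgrams(bag,count,'IgnoreCase',true);

disp(newBag.NumNgrams)
disp(newBagIgnoreCase.NumNgrams)
```

Comparing the NumNgrams property of the two results is a quick way to check how many n-grams a given threshold keeps.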
Load the example data. The file sonnetsPreprocessed.txt contains preprocessed versions of Shakespeare's sonnets. The file contains one sonnet per line, with words separated by a space. Extract the text from sonnetsPreprocessed.txt, split the text into documents at newline characters, and then tokenize the documents.
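The steps described above might look like the following sketch. It assumes that sonnetsPreprocessed.txt is available on the MATLAB path and that Text Analytics Toolbox is installed. The threshold value of 3 is an arbitrary choice for illustration.

```matlab
% Read the raw text; assumes sonnetsPreprocessed.txt is on the MATLAB path.
filename = "sonnetsPreprocessed.txt";
str = extractFileText(filename);

% One sonnet per line: split at newline characters, then tokenize.
textData = split(str,newline);
documents = tokenizedDocument(textData);

% Build a bag-of-n-grams model (bigrams by default).
bag = bagOfNgrams(documents);

% Remove the n-grams that appear 3 times or fewer in total.
% The threshold is an illustrative assumption, not a recommended value.
count = 3;
newBag = removeInfrequentNgrams(bag,count);
```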
Input bag-of-n-grams model, specified as a bagOfNgrams object.
count - Count threshold, positive integer
Count threshold, specified as a positive integer. The function removes the n-grams that appear count times in total or fewer.
lengths - N-gram lengths, positive integer | vector of positive integers
N-gram lengths, specified as a positive integer or a vector of positive integers.
If you specify lengths, then the function removes infrequent n-grams of the specified lengths only. If you do not specify lengths, then the function removes infrequent n-grams regardless of length.
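A minimal sketch of the effect of lengths, using a made-up toy corpus and assuming Text Analytics Toolbox is installed. The model here holds both unigrams and bigrams so that restricting removal to one length is visible.

```matlab
% Toy corpus (illustrative only).
documents = tokenizedDocument([
    "the quick brown fox"
    "the quick brown dog"
    "a lazy red fox"]);

% Count both unigrams and bigrams: 8 distinct unigrams and 7 distinct bigrams.
bag = bagOfNgrams(documents,'NgramLengths',[1 2]);

% Remove only the bigrams that appear at most once. All 8 unigrams stay,
% and the 2 bigrams that appear twice ("the quick", "quick brown") stay.
bigramsPruned = removeInfrequentNgrams(bag,1,'NgramLengths',2);

% Without lengths, infrequent n-grams of every length are removed. The
% 4 unigrams and 2 bigrams that appear twice are all that remain.
allPruned = removeInfrequentNgrams(bag,1);

disp(bigramsPruned.NumNgrams)
disp(allPruned.NumNgrams)
```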