Language Support

Information on language support in Text Analytics Toolbox™

Text Analytics Toolbox supports English, Japanese, German, and Korean. Most Text Analytics Toolbox functions also work with text in other languages. For more information, see Language Considerations.

Functions

tokenizedDocument Array of tokenized documents for text analysis
removeStopWords Remove stop words from documents
normalizeWords Stem or lemmatize words
stopWords List of stop words
mecabOptions Options for MeCab tokenization
tokenDetails Details of tokens in tokenized document array
addSentenceDetails Add sentence numbers to documents
addPartOfSpeechDetails Add part-of-speech tags to documents
addEntityDetails Add entity tags to documents
addLemmaDetails Add lemma forms of tokens to documents
addLanguageDetails Add language identifiers to documents
corpusLanguage Detect language of text

Topics

English Language

Text Data Preparation

Import text data into MATLAB® and preprocess it for analysis

Modeling and Prediction

Develop predictive models using topic models and word embeddings

Display and Presentation

Visualize text data and models using word clouds and text scatter plots

Japanese Language

Japanese Language Support

Information on Japanese support in Text Analytics Toolbox.

Analyze Japanese Text Data

This example shows how to import, prepare, and analyze Japanese text data using a topic model.

German Language

German Language Support

Information on German support in Text Analytics Toolbox.

Analyze German Text Data

This example shows how to import, prepare, and analyze German text data using a topic model.

Korean Language

Korean Language Support

Information on Korean support in Text Analytics Toolbox.

Other Languages

Language Considerations

Information on using Text Analytics Toolbox features for other languages.

Language-Independent Features

Text Analytics Toolbox features that do not depend on language details.

Featured Examples