Data analytics with human language data

Natural language processing (NLP) refers to the broad class of computational techniques for incorporating speech and text data, along with other types of engineering data, into the development of smart systems.

Raw human language data can come from a variety of sources, including audio signals, web and social media, documents and databases containing valuable information such as voice commands, public sentiment on topics, operational data, and maintenance reports. Natural language processing can be used to combine and simplify these large sources of data, transforming them into meaningful insight with visualizations, topic models, and machine learning classifiers. For example, using MATLAB®, you can detect the presence of human speech in an audio segment, perform speech-to-text transcription, and then perform text mining and machine learning on those sources.

Natural language processing is used in finance, manufacturing, electronics, software, information technology, and other industries for applications such as:

  • Automating the classification of reviews based on sentiment, whether positive or negative
  • Counting the frequency of words or phrases in documents and performing topic modeling
  • Developing predictive equipment maintenance schedules based on sensor and text log data
  • Automating labeling and tagging of speech recordings
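As a rough illustration of the word-frequency application listed above, the following is a minimal sketch in Python using only the standard library. It is not the MATLAB toolbox workflow this page describes. The function names `tokenize` and `word_frequencies` and the sample maintenance-log text are hypothetical, made up for this example, and the tokenizer is deliberately naive (lowercase ASCII letters only).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(documents):
    """Count how often each word token appears across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


# Hypothetical maintenance-log snippets, invented for illustration only.
logs = [
    "Pump vibration high; replaced bearing.",
    "Bearing noise on pump, vibration normal.",
    "Replaced filter. Pump running normally.",
]

freqs = word_frequencies(logs)

# Print the three most frequent words with their counts.
print(freqs.most_common(3))
```

Counts like these are typically the starting point for the other applications in the list: a topic model or a sentiment classifier usually takes per-document word counts as its input features. A real pipeline would likely also remove stop words and normalize word forms, which this sketch omits.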

To learn more about deriving understanding from speech and text data using natural language processing, see Text Analytics Toolbox™, Audio Toolbox™, and Statistics and Machine Learning Toolbox™.

See also: data science, machine learning, deep learning, sentiment analysis, text mining, long short-term memory (LSTM) networks