Types of AI learning models and the major natural language AI models

The following is a summary of the different learning methods used in AI models, and of the types and characteristics of the models behind the major natural language AIs.

Posted at: 2023.4.17

Types and characteristics of AI model learning

| Type of model | Explanation | Fields used |
| --- | --- | --- |
| Supervised learning | Models that learn the relationship between inputs and outputs from human-labeled data. The trained model is then used to make predictions on new input data. | Image recognition, speech recognition, natural language processing, classification, regression |
| Unsupervised learning | Models that use unlabeled data to automatically find patterns and structure in the data, group the data, and extract features. | Clustering, dimensionality reduction, anomaly detection |
| Reinforcement learning | Models that learn autonomously through interaction with an environment in order to achieve a goal. The optimal behavior is determined from value judgments called rewards. | Driving and other control tasks, robotics, game AI, financial-market investment strategies |
| Semi-supervised learning | Models that use both labeled and unlabeled data. By learning patterns from large amounts of unlabeled data and combining them with knowledge gained from a small amount of labeled data, prediction accuracy can be improved. | Image recognition, speech recognition, natural language processing, recommendation systems |
| Transfer learning | Models that apply knowledge useful for one problem to another. By reusing knowledge gained while training on one task when training on a different task, high performance can be achieved on problems with little training data. | Natural language processing, image recognition, speech recognition, object detection, face recognition, data mining |

The differences between these models lie in whether labeled training data is available, and in their learning methods and objectives.
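As a concrete illustration of the first row of the table, supervised learning can be reduced to its simplest form: fitting a function to labeled (input, output) pairs. The sketch below (function and variable names are mine, not from any library) fits a one-dimensional line by closed-form least squares and then predicts on a new, unseen input.

```python
# Minimal supervised-learning sketch: fit y ≈ w * x from labeled pairs.

def fit_slope(pairs):
    """Closed-form least squares for y = w * x (no intercept)."""
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs)
    return num / den

# Labeled training data: each input x is paired with its target y.
train = [(1, 2.1), (2, 3.9), (3, 6.2)]
w = fit_slope(train)

# Prediction on a new input uses the learned model.
print(round(w * 4, 1))  # → 8.1
```

Unsupervised learning would receive only the inputs `[1, 2, 3]` with no targets, and would have to find structure in them on its own.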

Major natural language AI learning models

BERT (unsupervised pre-training & transfer learning)

BERT is a transfer-learning model for natural language processing developed by Google. Many of the natural language AI models that have emerged in the past few years are improvements on BERT, and in this sense it can be called the "ancestor of natural language AI learning models."

BERT pre-trains on large amounts of text automatically collected from a huge number of web pages, learning representations of natural language without any labeled data.
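BERT's pre-training objective is masked language modeling: hide a fraction of the tokens and train the model to recover them from the surrounding context on both sides. A toy, stdlib-only sketch of how such training pairs could be built (the function is illustrative, not BERT's actual code):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=1):
    """Replace ~mask_rate of tokens with [MASK]; keep the originals as labels."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            targets[i] = tok        # label the model must recover
            masked.append("[MASK]")
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_tokens("the cat sat on the mat".split())
# The model sees `masked` and is trained to predict `targets`
# from both left and right context.
```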

Because it is a natural language processing model developed by Google, it is also used in the company's search engine to better "capture search intent."

XLNet (unsupervised learning)

XLNet, like BERT, is a natural language processing model from Google, developed jointly with Carnegie Mellon University.

Unlike BERT, XLNet learns from sentences whose word order has been permuted (permutation language modeling), and it was billed as "the latest natural language processing model that surpasses BERT" when it was first announced.

XLNet addresses a weakness of BERT: BERT predicts its masked tokens independently, so it cannot learn dependencies between them. By predicting words in many different orders instead of masking them, XLNet preserves those dependencies, which yields more natural sentences and better prediction accuracy.
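The reordering idea can be sketched in a few lines: draw a random factorization order and predict each token from the tokens that come before it in that order, so every predicted token can condition on previously predicted ones. A toy sketch (illustrative names, not XLNet's implementation):

```python
import random

def permutation_targets(tokens, seed=0):
    """Build (target position, visible context) pairs for one random order."""
    order = list(range(len(tokens)))
    random.Random(seed).shuffle(order)
    examples = []
    for pos, idx in enumerate(order):
        visible = sorted(order[:pos])  # positions already predicted
        examples.append((idx, [tokens[i] for i in visible]))
    return examples

examples = permutation_targets("new york is large".split())
# The first token in the order is predicted from nothing; the last
# is predicted from all three others, whatever their positions.
```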

RoBERTa (unsupervised learning)

RoBERTa is an improved version of BERT developed by Facebook.

RoBERTa achieves higher performance than BERT by refining the tokenization and preprocessing of the training data, and by training longer on a much larger amount of data than BERT.

In particular, it has been reported to show high accuracy on natural language inference and natural language generation tasks, and because the approach carries over to other languages, it has also been used for multilingual text processing.

ALBERT (unsupervised pre-training & transfer learning)

ALBERT is a lightweight, improved version of BERT developed by Google, which streamlines the BERT architecture so that natural language processing tasks can be performed quickly and efficiently.

Conventional natural language processing models require many GPUs to train at large scale, but ALBERT combines cross-layer "parameter sharing" with a factorized embedding parameterization to achieve high accuracy with far fewer parameters.
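The effect of cross-layer parameter sharing on model size is easy to see in a toy calculation (the numbers are illustrative, not ALBERT's real parameter counts):

```python
def count_params(num_layers, params_per_layer, shared):
    """Without sharing, parameters grow with depth; with sharing they don't."""
    return params_per_layer if shared else num_layers * params_per_layer

print(count_params(12, 1_000_000, shared=False))  # BERT-style:   12000000
print(count_params(12, 1_000_000, shared=True))   # ALBERT-style:  1000000
```

Because every layer reuses the same weights, adding depth no longer multiplies the parameter count, which is what makes the smaller ALBERT models feasible.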

ELECTRA (unsupervised learning)

ELECTRA is one of the more recent large-scale language models in natural language processing, proposed by researchers at Stanford University and Google Research and built on BERT.

It differs from OpenAI's GPT in that ELECTRA learns by "detecting which tokens in a sentence have been replaced by plausible fakes" (replaced token detection), while GPT learns by "predicting the next word from the preceding words."

These differences also lead to differences in how the models are used: ELECTRA excels at judging the correctness of text, while GPT excels at predicting and generating sentences.
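ELECTRA's replaced-token-detection signal can be sketched as follows (a toy, hand-rolled example with names of my choosing; in the real model a small generator network proposes the substitutes):

```python
def make_rtd_example(tokens, replacements):
    """replacements: {position: plausible_substitute} from a generator."""
    corrupted = list(tokens)
    labels = []
    for i in range(len(tokens)):
        if i in replacements:
            corrupted[i] = replacements[i]
            labels.append("replaced")
        else:
            labels.append("original")
    return corrupted, labels

tokens = "the chef cooked the meal".split()
corrupted, labels = make_rtd_example(tokens, {2: "ate"})
# Every position gets a training signal, unlike masked language
# modeling, where only the masked positions contribute to the loss.
```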

GPT (unsupervised learning)

GPT is a large-scale neural network model for natural language processing developed by OpenAI.

GPT learns the grammar and context of natural language by pre-training on a vast corpus of text from the web. It can therefore perform natural language processing tasks such as text generation, translation, and classification.
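GPT's next-word objective amounts to turning every prefix of a text into a training pair, predicting left to right (an illustrative sketch, not OpenAI's code):

```python
def next_token_pairs(tokens):
    """Every prefix becomes a (context, next-token) training pair."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

pairs = next_token_pairs("the cat sat on".split())
# e.g. (["the", "cat"], "sat"): predict the next word from all
# previous words; generation repeats this one word at a time.
```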

This vast amount of training also makes GPT highly attuned to text: ChatGPT, an interactive AI service built on GPT, demonstrates high accuracy and human-like naturalness in situations that call for natural conversation and sentence generation.
