One article to help you understand how GPT models work


Generative AI is one of the hottest topics in technology today, and AI tools like ChatGPT have brought it into the mainstream. Behind ChatGPT sits the GPT family of models, and this article explains how those models work.

What is GPT?

GPT, created by OpenAI, stands for Generative Pre-trained Transformer.


Back in 2021 I wrote my first lines of code using a GPT model, through early access to GPT-3 in the Azure OpenAI Service. Asking GPT-3 to summarize long documents and experimenting with few-shot prompts made it clear that its results were far more advanced than those of earlier models. Now that the follow-on GPT-3.5, ChatGPT, and GPT-4 models are rapidly gaining adoption, more people are curious about how they work. While the details of their inner workings are proprietary and complex, all GPT models share some fundamental ideas that aren't hard to understand, and that's what this article covers.

Let's start by exploring how a generative language model works. The basic idea is the following: it takes n tokens as input and produces one token as output.

But what exactly is a token? A token is a chunk of text. In OpenAI's GPT models, common short words typically correspond to a single token, while long or rare words are broken into several. Take the sentence "We need to stop anthropomorphizing ChatGPT.": the word "We" is a single token, "anthropomorphizing" is split into three tokens, and an abbreviation like "ChatGPT" may be one token or several, depending on how commonly its letters appear together. You can try OpenAI's Tokenizer[1] page, enter some text, and see how it gets split into tokens; it lets you choose between "GPT-3" tokenization, used for text, and "Codex" tokenization, used for code. We'll keep the default "GPT-3" setting.

You can also tokenize text in Python code using OpenAI's open-source tiktoken[2] library. OpenAI ships several different tokenizers that each behave slightly differently; the code below uses the tokenizer for "davinci", a GPT-3 model, which reproduces the behavior you see in the Tokenizer UI.

import tiktoken

# Get the encoding for the davinci GPT-3 model, which is the "r50k_base" encoding.
encoding = tiktoken.encoding_for_model("davinci")

text = "We need to stop anthropomorphizing ChatGPT."
print(f"text: {text}")

token_integers = encoding.encode(text)
print(f"total number of tokens: {encoding.n_vocab}")
print(f"token integers: {token_integers}")

token_strings = [encoding.decode_single_token_bytes(token) for token in token_integers]
print(f"token strings: {token_strings}")
print(f"number of tokens in text: {len(token_integers)}")

encoded_decoded_text = encoding.decode(token_integers)
print(f"encoded-decoded text: {encoded_decoded_text}")

text: We need to stop anthropomorphizing ChatGPT.
total number of tokens: 50257
token integers: [1135, 761, 284, 2245, 17911, 25831, 2890, 24101, 38, 11571, 13]
token strings: [b'We', b' need', b' to', b' stop', b' anthrop', b'omorph', b'izing', b' Chat', b'G', b'PT', b'.']
number of tokens in text: 11
encoded-decoded text: We need to stop anthropomorphizing ChatGPT.

Notice in the output how the tokens match the description above: the short common word "We" maps to a single token, "anthropomorphizing" is split into three tokens, and the abbreviation "ChatGPT" is broken into "Chat", "G", and "PT".

You can also see that this tokenizer contains 50,257 different tokens, that each token is internally mapped to an integer index, and that encoding a string and then decoding the result gives back exactly the original text.

You may wonder why OpenAI chose these particular token lengths. Consider the simplest alternative, where every letter is a token: splitting text becomes trivial and the vocabulary stays small, but each token carries nearly no information. In the example above, 11 letter-based tokens could encode only "We need to", while 11 of OpenAI's tokens encode the entire sentence. That matters because current language models limit the number of tokens they can receive, so we want to pack as much information as possible into each token.
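To make that comparison concrete, here is a small sketch that reuses the tiktoken encoding from the snippet above and contrasts what a budget of 11 character-level tokens can hold with what 11 GPT-3 tokens can hold:

import tiktoken

encoding = tiktoken.encoding_for_model("davinci")
text = "We need to stop anthropomorphizing ChatGPT."

# Character-level tokenization: one token per character, so 11 tokens
# cover only the first 11 characters of the sentence.
print(f"11 character tokens: {text[:11]!r}")

# OpenAI's sub-word tokenization: the same budget of 11 tokens
# covers the entire sentence.
tokens = encoding.encode(text)
print(f"11 GPT-3 tokens: {encoding.decode(tokens[:11])!r}")
print(f"tokens needed for the full sentence: {len(tokens)}")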

Now consider the other extreme, where every word is a token. Compared to OpenAI's approach, the same sentence would need fewer tokens, and splitting on spaces is straightforward to implement. The problem is that a language model needs a complete list of every token it might ever encounter, and that isn't feasible for whole words: the dictionary is enormous, domain-specific terminology keeps growing, and new words are invented all the time.

So it's not surprising that OpenAI settled on a solution somewhere between those two extremes. Other companies have released tokenizers that follow a similar approach, for example Google's Sentence Piece[3].

Now that we have a better intuition for tokens, let's return to the original idea: a generative model takes n tokens as input, which could be anywhere from a few words to a few pages, and produces a single token as output, which could be a short word or a piece of one.

But if you've used OpenAI's ChatGPT[4], you know it produces far more than one token. That's because the basic idea is applied in an expanding-window pattern: the model takes your n tokens in, produces one token out, appends that token to the input, produces another token, and keeps repeating until a stopping condition indicates the text is complete.

"We need to"

While playing with ChatGPT you may also have noticed that it isn't deterministic: ask exactly the same question twice and you'll likely get two different answers. That's because the model doesn't actually output a single token; it outputs a probability distribution over all possible tokens, and then samples from that distribution to pick the next token.
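In code, the expanding-window loop looks roughly like the sketch below. The model itself is abstracted behind a hypothetical next_token_distribution function (not a real API) that returns a probability for every candidate token; the loop samples from that distribution, appends the chosen token, and repeats:

import random

def generate(prompt_tokens, next_token_distribution, stop_token, max_tokens=100):
    """Expanding-window generation: n tokens in, sample 1 token out, repeat.

    next_token_distribution is a stand-in for the model: given the token
    sequence so far, it returns a list of (token, probability) pairs.
    """
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        distribution = next_token_distribution(tokens)
        candidates = [tok for tok, _ in distribution]
        weights = [p for _, p in distribution]
        # Sampling (rather than always taking the most probable token) is
        # why the model can give different answers to the same question.
        next_token = random.choices(candidates, weights=weights, k=1)[0]
        if next_token == stop_token:
            break
        tokens.append(next_token)
    return tokens

If the loop always picked the single most probable token instead of sampling, the output would be deterministic, and you would get the same answer every time.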

Language models before GPT

Hidden Markov Models (HMMs) became popular in the 1970s. In terms of our "n tokens in, one token out" framing, an HMM makes the Markov assumption: the next token depends only on the most recent token, so it effectively uses n = 1.

That means if you give an HMM the input "The quick brown fox jumps over the", it predicts the next word using only the final "the" and discards everything else. With so little context, it's no surprise the prediction is poor; the word "lazy" is easy to guess from the full sentence but nearly impossible to guess from "the" alone.

N-gram models became popular in the 1990s and improved on HMMs by taking more than one token of context into account. A model predicting from the last two words, "over the", has a far better chance of proposing "lazy".

The simplest n-gram is a character-level bi-gram, which predicts the next character from the current one alone. It needs nothing more than an n x n table of counts, where n is the number of characters in the vocabulary. But because each step sees only a single character, a model trained on text containing "car" can just as happily generate "caar".

To generate text, you look up the row of counts for the current character, say "c", and sample the next character in proportion to those counts. N-grams extend this idea to larger contexts, but the table grows exponentially with n, so in practice n has to stay small.
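Since the mechanics fit in a few lines, here is a minimal character-level bi-gram as a sketch, with a toy training text and all names invented for illustration:

import random
from collections import defaultdict

def train_bigram(text):
    # counts[a][b] = how many times character b followed character a.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, start, length=20):
    out = start
    for _ in range(length):
        followers = counts[out[-1]]
        if not followers:
            break  # no observed follower for this character
        chars, weights = zip(*followers.items())
        # Sample the next character in proportion to how often it
        # followed the current character in the training text.
        out += random.choices(chars, weights=weights, k=1)[0]
    return out

counts = train_bigram("the quick brown fox jumps over the lazy dog")
print(generate(counts, "t"))  # locally plausible, globally nonsense

Because each step conditions on just one character, the output tends to look right over any two adjacent characters while drifting into non-words overall, which is exactly the weakness described above.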

As a result, n-gram models still don't generate good text: the context they can afford to consider is simply too short.

In the 2000s, Recurrent Neural Networks (RNNs) became fairly popular. An RNN is a neural network that can accept a much larger number of input tokens than earlier techniques, and later variants such as LSTMs and GRUs refined the basic RNN design.

An RNN processes its input one token at a time, folding each token into a running hidden state. Given "We need to", it first consumes "We", then "need", then "to", and the accumulated state is what the RNN uses to predict the next token.
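Here is a sketch of that sequential processing, with made-up dimensions and random, untrained weights; it illustrates only the recurrence, not a trained model. Each step mixes the current token's embedding into the running hidden state:

import numpy as np

rng = np.random.default_rng(0)
hidden_size, embed_size = 8, 4

# Random weights, purely illustrative.
W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden -> hidden
W_x = rng.normal(size=(hidden_size, embed_size))   # input  -> hidden

# One embedding vector per input token ("We", "need", "to").
token_embeddings = rng.normal(size=(3, embed_size))

h = np.zeros(hidden_size)   # hidden state starts empty
for x in token_embeddings:  # tokens must be consumed one at a time, in order
    h = np.tanh(W_h @ h + W_x @ x)

print(h)  # final hidden state: the network's summary of "We need to"

Note that the whole input is squeezed into that one fixed-size vector h, which is why information from early tokens fades as sequences get long.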

LSTMs and GRUs are RNN variants built around an internal memory cell with gates that decide what to remember and what to forget, which lets them carry information across longer stretches of text than a plain RNN.

Even so, LSTMs and GRUs have real limits: information from tokens seen long ago still fades from the memory cell, and like all RNNs they must process tokens strictly one after another, which makes them slow to train on long inputs.

In 2017, Google introduced Transformers in the paper "Attention Is All You Need"[5]. Unlike RNNs, Transformers can process all of their input tokens in parallel, which makes them a natural fit for GPUs and lets them train on vastly more data. Transformers are the architecture OpenAI adopted for its GPT models.

The key innovation of Transformers is the attention mechanism, which lets the model decide which parts of the input deserve the most focus when producing each output token.

For example, in a sentence like "She went to the store and bought", the verb "went" matters far more than the connective "and" for predicting what comes next; attention lets the model weight "went" heavily while mostly ignoring "and", no matter how far apart the words are.

More precisely, GPT models use a version of the attention from the 2017 architecture called masked multi-head attention. Taking the name apart:

Attention: the mechanism just described, which assigns each input token a weight indicating how relevant it is to the token the Transformer is producing.

Masked: because GPT generates text left to right, each position is only allowed to attend to the tokens before it; the mask hides everything that comes later.

Multi-head: the Transformer runs several attention computations ("heads") in parallel, each free to learn a different notion of relevance, and combines their results.

Unlike the memory cell of an LSTM or GRU, whose contents degrade as the sequence grows, attention lets a Transformer reach directly back to any earlier token, so important context survives no matter how long ago it appeared.
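The sketch below shows the core computation of single-head masked self-attention in numpy, following the standard scaled dot-product formulation from the 2017 paper; the projection matrices are random here, where a real GPT would use learned weights and many heads in parallel:

import numpy as np

def masked_attention(X, W_q, W_k, W_v):
    """Single-head masked self-attention over a sequence of token vectors X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # how much each token "cares" about each other token
    # Masking: position i may only attend to positions <= i,
    # so future tokens stay hidden during left-to-right generation.
    n = scores.shape[0]
    scores = np.where(np.tril(np.ones((n, n), dtype=bool)), scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V  # weighted mix of the value vectors

rng = np.random.default_rng(0)
d = 8                        # model dimension (illustrative)
X = rng.normal(size=(5, d))  # 5 token vectors
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
print(masked_attention(X, W_q, W_k, W_v).shape)  # (5, 8)

Note how the mask guarantees that the first token's output depends only on itself, matching the left-to-right generation described earlier; multi-head attention simply runs several such computations with different projection matrices and concatenates the results.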

It's the Transformer architecture that made models at GPT's scale possible, and it remains the foundation of every GPT model released since.

The GPT models

OpenAI's recent models, GPT-3.5, ChatGPT, and GPT-4, are all based on the Transformer architecture; what sets the GPT generations apart is how that Transformer is scaled and trained.

GPT-3.5 is a very large Transformer trained on enormous amounts of internet text to predict the next token.

ChatGPT is a GPT-3.5 Transformer fine-tuned with Reinforcement Learning from Human Feedback (RLHF), a technique OpenAI introduced in its 2022 InstructGPT work: human labelers rate the model's outputs, a reward model is trained to mimic those preferences, and the language model is then tuned further to maximize that reward. This is what pushes the model toward the helpful, conversational answers OpenAI intended.

GPT-4 likewise builds on pre-training plus RLHF, though OpenAI has published few details about its size or internals.

How to use GPT models

You can call GPT models programmatically either through the OpenAI API or through the Azure OpenAI Service. Both give access to the same family of models over an API, but the Azure OpenAI Service offers a few things worth weighing:

Integration with the rest of the Azure platform, including its enterprise-grade security and compliance.

Responsible-AI safeguards, such as content filtering applied to API calls.

Azure's regional availability, so data and processing can stay in a chosen geography.

An API that closely mirrors OpenAI's, so code written against one is easy to adapt to the other.

These are the GPT models currently available in the Azure OpenAI Service:

GPT-3.5: text-davinci-002 and text-davinci-003

ChatGPT: gpt-35-turbo

GPT-4: gpt-4 and gpt-4-32k

The two GPT-4 versions differ mainly in the number of tokens they support: gpt-4 supports 8,000 tokens, while gpt-4-32k supports 32,000. In contrast, the GPT-3.5 models support around 4,000 tokens.
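As an illustration, here is a sketch of calling gpt-35-turbo through the pre-1.0 openai Python package configured for Azure OpenAI; the endpoint, key, API version, and deployment name are placeholders you would replace with your own resource's values:

import openai

# All four values below are placeholders for your own Azure OpenAI resource.
openai.api_type = "azure"
openai.api_base = "https://YOUR-RESOURCE-NAME.openai.azure.com/"  # your endpoint
openai.api_version = "2023-05-15"  # an API version your resource supports
openai.api_key = "YOUR-API-KEY"

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",  # the name you gave your model deployment
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain tokens in one sentence."},
    ],
)
print(response.choices[0].message.content)

Calling the OpenAI API directly looks almost identical, except that you pass a model name instead of a deployment name and skip the Azure-specific configuration.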

GPT-4 is currently the most capable of these models; for up-to-date details, see the Azure OpenAI documentation on models[6].

OpenAI's GPT models keep advancing quickly, but the ideas underneath them have stayed remarkably stable.

At their core, they take n tokens in and produce one token out, and the Transformer, the architecture OpenAI built GPT on, is what made scaling that simple idea possible.

I hope this gives you a solid foundation for understanding GPT models and for using them in your own work.

Original article: https://towardsdatascience.com/how-gpt-models-work-b5f4517d5b5

References

[1] Tokenizer: https://platform.openai.com/tokenizer
[2] tiktoken: https://github.com/openai/tiktoken
[3] Sentence Piece: https://github.com/google/sentencepiece
[4] OpenAI ChatGPT: https://chat.openai.com/
[5] Attention Is All You Need: https://arxiv.org/abs/1706.03762
[6] Azure OpenAI Service models: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models

