Understanding How GPT Models Work
AI tools are everywhere these days, and none has captured more attention than ChatGPT. But what is actually happening when an AI model like ChatGPT produces text? This article explains how the GPT models behind it work.
What is GPT?
GPT stands for Generative Pre-trained Transformer, the family of generative language models created by OpenAI.
I first got access to these models in 2021, when GPT-3 became available through Azure OpenAI. Seeing GPT-3 complete tasks from just a handful of examples (few-shot learning) was striking, and the models have only improved since: GPT-3.5, ChatGPT, and GPT-4 each brought a step up in capability. This article focuses on the ideas shared by all GPT models.
The first concept to understand is that a generative language model takes n tokens as input and produces one token as output: its prediction of the most likely next token.
So what is a token? A token is a chunk of text: in OpenAI's GPT models, a common short word is usually a single token, while a longer or rarer word is split into several. Consider the sentence "We need to stop anthropomorphizing ChatGPT." To see how a model splits text like this, you can paste it into OpenAI's online Tokenizer [1], which shows the tokenization used by GPT-3 and Codex.
To tokenize text programmatically, OpenAI provides the tiktoken [2] Python package. The snippet below uses the tokenizer for davinci, a GPT-3 model, so the results match what the Tokenizer UI shows.
import tiktoken

# davinci, a GPT-3 model, uses the r50k_base encoding
encoding = tiktoken.encoding_for_model("davinci")

text = "We need to stop anthropomorphizing ChatGPT."
print(f"text: {text}")

token_integers = encoding.encode(text)
print(f"total number of tokens: {encoding.n_vocab}")
print(f"token integers: {token_integers}")

token_strings = [encoding.decode_single_token_bytes(token) for token in token_integers]
print(f"token strings: {token_strings}")
print(f"number of tokens in text: {len(token_integers)}")

encoded_decoded_text = encoding.decode(token_integers)
print(f"encoded-decoded text: {encoded_decoded_text}")
text: We need to stop anthropomorphizing ChatGPT.
total number of tokens: 50257
token integers: [1135, 761, 284, 2245, 17911, 25831, 2890, 24101, 38, 11571, 13]
token strings: [b'We', b' need', b' to', b' stop', b' anthrop', b'omorph', b'izing', b' Chat', b'G', b'PT', b'.']
number of tokens in text: 11
encoded-decoded text: We need to stop anthropomorphizing ChatGPT.
Notice in the output that common words like "We" and "need" map to single tokens, while the rarer strings "anthropomorphizing" and "ChatGPT" are split into several tokens each. Decoding the token sequence reproduces the original sentence exactly.
In total, this tokenizer's vocabulary contains 50,257 tokens, each associated with an integer index.
Why tokens rather than whole words or single characters? Tokens strike a balance: the vocabulary stays a manageable size, common words still encode to a single token (our sentence needed only 11), and any rare or novel word can be expressed as a sequence of sub-word tokens, so there is no text the model cannot represent. OpenAI is not alone in taking this approach: Google's SentencePiece [3] is another widely used open-source tokenizer, and it supports many languages.
Now back to the core idea: a generative model takes n tokens as input and produces one token as output, its prediction of the next token. To generate more than one token, the model is run in a loop: the predicted token is appended to the input, and the model is called again, over and over, until the response is complete. You can watch this happen by chatting with OpenAI's ChatGPT [4]: the reply appears piece by piece because tokens are streamed as they are generated. For example, starting from the input "We need to", the model might predict " stop"; the next call then receives "We need to stop", and so on.
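That generation loop can be sketched in a few lines of Python. Everything here is a toy stand-in: toy_model is a hard-coded lookup table, not a real model, and only the loop structure mirrors what actually happens.

```python
# Sketch of the token-by-token generation loop. "toy_model" is a
# hard-coded lookup table standing in for a real GPT model; only the
# loop structure mirrors what actually happens.

def toy_model(tokens):
    """Pretend model: return the most likely next token for a prefix."""
    table = {
        ("We", "need", "to"): "stop",
        ("We", "need", "to", "stop"): "anthropomorphizing",
        ("We", "need", "to", "stop", "anthropomorphizing"): "ChatGPT.",
    }
    return table.get(tuple(tokens), "<end>")

def generate(tokens, max_new_tokens=10):
    tokens = list(tokens)
    for _ in range(max_new_tokens):
        next_token = toy_model(tokens)  # one model call -> one token
        if next_token == "<end>":
            break
        tokens.append(next_token)       # feed the output back in as input
    return tokens

print(" ".join(generate(["We", "need", "to"])))
# -> We need to stop anthropomorphizing ChatGPT.
```

A real deployment works the same way, except the model returns a probability distribution over its whole vocabulary at each step rather than a single fixed answer.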
How language models evolved into GPT
Generative language models have been around for decades. The earliest popular ones were Hidden Markov Models (HMMs), introduced in the 1970s. In our "n tokens in, one token out" framing, an HMM uses n = 1: it predicts the next word from the current word alone.
For example, given the input "The quick brown fox jumps over the", we would like the model to predict "lazy". An HMM, however, sees only the final word, "the", and with so little context its prediction is unlikely to be the one we want.
N-gram models, which became popular in the 1990s, addressed this main limitation of HMMs by conditioning on more than one token. Given the full phrase ending in "...jumps over the", an n-gram model with a large enough n would have a much better chance of predicting "lazy".
The simplest n-gram is a character-level bi-gram (n = 2), which predicts the next character from the current character alone. Under the hood it is just an n x n table of counts, one row and one column per character, recording how often each character follows another in the training text. Trained on text containing the word "car", it learns that "c" tends to be followed by "a" and "a" by "r"; but with only one character of context it can easily wander into non-words such as "caar" once trained on broader text. A bi-gram captures very little context, and although predictions improve as n grows, the count table grows exponentially with n, so n-gram models never scaled to the long contexts that good text generation requires.
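A character-level bi-gram really can be written in a few lines. This sketch just counts character successors in a tiny training string; the training text and function names are made up for illustration, not taken from any library.

```python
from collections import defaultdict

# Minimal character-level bi-gram: an n x n table of counts recording
# how often each character follows another. Purely illustrative.

def train_bigram(text):
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, ch):
    """Return the character that most often followed `ch` in training."""
    followers = counts[ch]
    return max(followers, key=followers.get) if followers else None

counts = train_bigram("the quick brown fox jumps over the lazy dog")
print(predict_next(counts, "t"))  # 'h' -- every 't' in this text precedes 'h'
```

Making this an n-gram for larger n just means keying the table on the last n-1 characters instead of one, which is exactly where the exponential blow-up comes from.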
In the 2000s, Recurrent Neural Networks (RNNs) became a popular alternative. Unlike n-grams, an RNN can in principle take an arbitrarily long input; LSTMs and GRUs, refinements of the basic RNN, became the most widely used variants.
An RNN consumes its input one token at a time while updating an internal hidden state that summarizes everything seen so far. To process "We need to", an RNN reads "We", then "need", then "to", updating its state at each step; the final state is then used to predict the next token.
Plain RNNs struggle to retain information across long sequences. LSTMs and GRUs improve on this with a memory cell: an internal state carried from step to step, with learned gates that decide what to keep and what to forget, so information from earlier tokens survives longer.
Even so, LSTMs and GRUs have real limits: information from distant tokens still fades, and because each step depends on the previous one, the computation is inherently sequential and hard to parallelize, which makes training on large datasets slow.
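The fading-memory problem is visible even in a toy recurrent update. The scalar weights below are arbitrary stand-ins for a real RNN's learned weight matrices; the point is only how the early input's influence decays.

```python
import math

# Toy recurrent update: the new hidden state mixes the current input
# with the previous hidden state. The scalar weights are arbitrary
# stand-ins for a real RNN's learned weight matrices.

def rnn_step(x, h_prev, w_x=0.5, w_h=0.9):
    return math.tanh(w_x * x + w_h * h_prev)

h = 0.0
for x in [1.0, 0.0, 0.0, 0.0, 0.0]:  # one burst of input, then silence
    h = rnn_step(x, h)
    print(round(h, 3))
# The printed values shrink step by step: the early input's influence
# fades, which is the long-range memory problem that gating tries to ease.
```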
In 2017, Google introduced Transformers in the paper "Attention Is All You Need" [5]. Transformers dispense with recurrence entirely, which means they can be parallelized efficiently on GPUs and trained on far more data, and they handle much longer inputs well. Transformers are the architecture OpenAI chose for its GPT models.
The key innovation of Transformers is the attention mechanism: rather than treating every input token equally, the model learns to focus on the parts of the input that matter most for the current prediction, wherever they appear in the sequence.
For example, consider predicting the word that follows "bought" in "She went to the store and bought". The token "store" is highly relevant, "went" matters somewhat, and a connective like "and" matters very little. Attention lets the model assign each input token exactly as much influence as it deserves.
GPT models use the Transformer architecture from the 2017 paper, specifically a variant built around masked multi-head attention. Breaking that phrase down:
Attention: the mechanism just described, which lets the Transformer weigh how relevant each input token is to the token being predicted.
Masked: attention is restricted to the tokens that come before the current position. Since GPT generates text left to right, each position must be "masked" from seeing tokens that come after it.
Multi-head: the Transformer runs several attention operations ("heads") in parallel, each free to learn a different notion of relevance, and combines their results.
Whereas the memory cell of an LSTM or GRU gradually dilutes information from distant tokens, attention gives a Transformer direct access to every earlier token in the input, no matter how far back it appears.
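A single masked attention head can be sketched in NumPy. The random weight matrices below stand in for learned parameters; the masking and the per-row softmax are the parts that matter here.

```python
import numpy as np

# Sketch of a single masked self-attention head. Random weights stand
# in for learned parameters; only the mechanics are the point.

def masked_attention(X):
    """X: (seq_len, d) matrix of token embeddings."""
    seq_len, d = X.shape
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)  # relevance of every token pair

    # Mask: position i may only attend to positions <= i (no future peeking)
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[mask] = -np.inf

    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax per row
    return weights @ V, weights

X = np.random.default_rng(1).normal(size=(4, 8))  # 4 tokens, embedding dim 8
out, w = masked_attention(X)
print(np.round(w, 2))  # upper triangle is all zeros: no attention to the future
```

In a real model this runs once per head, the heads' outputs are concatenated and projected, and the whole thing is stacked in many layers.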
The Transformer is the architecture that made today's GPT models possible. Let's now look at how the various GPT models differ from one another.
The GPT model family
OpenAI's recent models, GPT-3.5, ChatGPT, and GPT-4, are all built on the Transformer architecture. What separates one GPT generation from the next is chiefly scale (the amount of training data and the number of parameters) and how each model is fine-tuned after pre-training.
GPT-3.5 is a Transformer model pre-trained on a very large corpus of text from the internet.
ChatGPT begins as a GPT-3.5 Transformer model and is then fine-tuned with Reinforcement Learning from Human Feedback (RLHF): human trainers rate candidate responses, and those ratings are used to steer the model toward answers people actually find helpful. OpenAI first described this technique in its 2022 InstructGPT work, and it is a big part of why OpenAI's chat models feel so much more usable than raw pre-trained models.
GPT-4 is also trained with RLHF, though OpenAI has published far fewer details about its architecture and training.
How to use GPT models
To call GPT models from your own code, you can use either the OpenAI API or the Azure OpenAI API. The two APIs expose the same underlying models and look very similar, so code written against one is straightforward to port to the other. Azure OpenAI adds what you would expect from an Azure service: enterprise-grade security and compliance, integration with other Azure services, and responsible-AI safeguards such as content filtering built into the API.
At the time of writing, Azure OpenAI supports the following models:
GPT-3.5: text-davinci-002, text-davinci-003
ChatGPT: gpt-35-turbo
GPT-4: gpt-4, gpt-4-32k
The two GPT-4 variants differ mainly in the number of tokens they support: gpt-4 accepts up to 8,000 tokens (prompt and completion combined), while gpt-4-32k accepts up to 32,000. For comparison, the GPT-3.5 models top out at about 4,000 tokens.
GPT-4 is also the most expensive of these models to run. For current details on all available models, see the Azure OpenAI models documentation [6].
OpenAI keeps releasing new and better models, but the fundamentals covered here should remain a solid foundation. We have seen that GPT models generate text by repeatedly taking n tokens in and producing one token out, that they represent text as sub-word tokens, and that they are built on Google's Transformer architecture and its masked multi-head attention, which OpenAI has scaled up and fine-tuned into the GPT family. With these fundamentals, you should be well placed to understand whatever comes next.
Original article: How GPT Models Work
https://towardsdatascience.com/how-gpt-models-work-b5f4517d5b5
References
[1] Tokenizer: https://platform.openai.com/tokenizer
[2] tiktoken: https://github.com/openai/tiktoken
[3] Sentence Piece: https://github.com/google/sentencepiece
[4] OpenAI ChatGPT: https://chat.openai.com/
[5] Attention Is All You Need: https://arxiv.org/abs/1706.03762
[6] Azure OpenAI Service models: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models