Public Member Functions | |
| __init__ (self, vocab_size) | |
| constructor for BigramLanguageModel that calls super().__init__() and then creates a new embedding table of vocab_size by n_embd | |
| forward (self, idx, targets=None) | |
| runs the model "ahead" by computing the next set of logits, then returns the logits and the loss (None when no targets are given) | |
| generate (self, idx, max_new_tokens) | |
| the generator of the bigram model: autoregressively samples new tokens and appends them to the context | |
Public Attributes | |
| token_embedding_table = nn.Embedding(vocab_size, n_embd) | |
| position_embedding_table = nn.Embedding(block_size, n_embd) | |
| lm_head = nn.Linear(n_embd, vocab_size) | |
| bigram.Bigram.BigramLanguageModel.__init__ | ( | self, | |
| vocab_size ) |
constructor for BigramLanguageModel that calls super().__init__() and then creates a new embedding table of vocab_size by n_embd
| self | the current BigramLanguageModel object |
| vocab_size | the size of the vocabulary |
Definition at line 154 of file Bigram.py.
References __init__().
Referenced by __init__().
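A minimal sketch of the constructor, assembled from the attribute initializers listed above. The hyperparameter values (n_embd, block_size) are hypothetical stand-ins for the module-level globals that Bigram.py presumably defines:

```python
import torch.nn as nn

# hypothetical values for the module-level hyperparameters in Bigram.py
vocab_size, n_embd, block_size = 65, 32, 8

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # token identity -> embedding vector, one row per vocabulary entry
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        # position index -> embedding vector, one row per context slot
        self.position_embedding_table = nn.Embedding(block_size, n_embd)
        # projects embeddings back to per-token logits over the vocabulary
        self.lm_head = nn.Linear(n_embd, vocab_size)
```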
| bigram.Bigram.BigramLanguageModel.forward | ( | self, | |
| idx, | |||
| targets = None ) |
runs the model "ahead" by computing the next set of logits, then returns the logits and the loss (None when no targets are given)
| self | the current BigramLanguageModel object |
| idx | the current context batch |
| targets | the ground-truth next tokens for each position |
Definition at line 169 of file Bigram.py.
References lm_head, and token_embedding_table.
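A sketch of the forward pass, hedged as a reconstruction: per the references above it uses only token_embedding_table and lm_head, and the cross-entropy loss shown is the standard choice for this model rather than a detail confirmed by this page. n_embd is a hypothetical value:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_embd = 32  # hypothetical embedding width

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        # idx: (B, T) batch of token indices
        tok_emb = self.token_embedding_table(idx)  # (B, T, n_embd)
        logits = self.lm_head(tok_emb)             # (B, T, vocab_size)
        if targets is None:
            loss = None
        else:
            B, T, C = logits.shape
            # flatten batch and time so cross_entropy sees (N, C) vs (N,)
            loss = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))
        return logits, loss
```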
| bigram.Bigram.BigramLanguageModel.generate | ( | self, | |
| idx, | |||
| max_new_tokens ) |
the generator of the bigram model: autoregressively samples new tokens and appends them to the context
idx is the current context of characters in the current batch. We extend each of the B sequences along the T dimension, since prediction happens along the T dimension. We take the current idx and focus on the last element in the T dimension; its logits are converted to probabilities using softmax and then sampled using multinomial. We then take the sampled integers and concatenate them onto the context.
| self | the current BigramLanguageModel object |
| idx | the current context batch (which is dim(B by T)) |
| max_new_tokens | the number of new tokens we want to generate |
Definition at line 197 of file Bigram.py.
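The sampling loop described above can be sketched as follows. This is a reconstruction from the description, not the exact source; the stripped-down forward pass included here (no loss, hypothetical n_embd) exists only to make the example self-contained:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        n_embd = 32  # hypothetical embedding width
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        # minimal stand-in for the real forward(): (B, T) -> (B, T, vocab_size)
        logits = self.lm_head(self.token_embedding_table(idx))
        return logits, None

    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            logits, _ = self(idx)              # (B, T, vocab_size)
            logits = logits[:, -1, :]          # focus on the last element in T: (B, vocab_size)
            probs = F.softmax(logits, dim=-1)  # convert logits to probabilities
            idx_next = torch.multinomial(probs, num_samples=1)  # sample one token per sequence: (B, 1)
            idx = torch.cat((idx, idx_next), dim=1)             # concat onto the context along T
        return idx
```

Each iteration grows the context by one token, so the returned tensor has shape (B, T + max_new_tokens).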
| bigram.Bigram.BigramLanguageModel.lm_head = nn.Linear(n_embd, vocab_size) |
Definition at line 167 of file Bigram.py.
Referenced by forward(), and working_gpt.GPTLanguageModel.forward().
| bigram.Bigram.BigramLanguageModel.position_embedding_table = nn.Embedding(block_size, n_embd) |
Definition at line 166 of file Bigram.py.
Referenced by working_gpt.GPTLanguageModel.forward().
| bigram.Bigram.BigramLanguageModel.token_embedding_table = nn.Embedding(vocab_size, n_embd) |
Definition at line 165 of file Bigram.py.
Referenced by forward(), and working_gpt.GPTLanguageModel.forward().