Transformer fundamentals
 
bigram.Bigram.BigramLanguageModel Class Reference
Inheritance diagram for bigram.Bigram.BigramLanguageModel:

Public Member Functions

 __init__ (self, vocab_size)
 constructor for BigramLanguageModel: calls super().__init__() and creates the token embedding table, the position embedding table, and the language-model head lm_head
 
 forward (self, idx, targets=None)
 runs the forward pass: computes the next-token logits and, when targets are given, the cross-entropy loss
 
 generate (self, idx, max_new_tokens)
 generates max_new_tokens new tokens from the bigram model and appends them to idx
 

Public Attributes

 token_embedding_table = nn.Embedding(vocab_size, n_embd)
 
 position_embedding_table = nn.Embedding(block_size, n_embd)
 
 lm_head = nn.Linear(n_embd, vocab_size)
 

Detailed Description

Definition at line 152 of file Bigram.py.

Constructor & Destructor Documentation

◆ __init__()

bigram.Bigram.BigramLanguageModel.__init__ ( self,
vocab_size )

constructor for BigramLanguageModel: calls super().__init__() and creates the token embedding table, the position embedding table, and the language-model head lm_head

Parameters
self            the current BigramLanguageModel object
vocab_size      the size of the vocabulary
Returns
new BigramLanguageModel

Definition at line 154 of file Bigram.py.

154 def __init__(self, vocab_size):
155 """
156 @brief constructor for BigramLanguageModel: calls super().__init__() and
157 creates the token and position embedding tables and the lm_head
158
159 @param self: the current BigramLanguageModel object
160 @param vocab_size: the size of the vocabulary
161 @return new BigramLanguageModel
162 """
163 super().__init__()
164 # each token directly reads off the logits for the next token from a lookup table
165 self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
166 self.position_embedding_table = nn.Embedding(block_size, n_embd)
167 self.lm_head = nn.Linear(n_embd, vocab_size)
168

References __init__().

Referenced by __init__().
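
As a hedged usage sketch (not part of Bigram.py): the snippet below constructs the model, assuming the bigram.Bigram module is importable and that it defines n_embd and block_size at module level, as the attributes above suggest. The toy text and the resulting vocab_size are illustrative.

from bigram.Bigram import BigramLanguageModel

text = "hello world"                      # illustrative training text
chars = sorted(set(text))
vocab_size = len(chars)                   # size of the character vocabulary

model = BigramLanguageModel(vocab_size)
print(model.token_embedding_table)        # Embedding(vocab_size, n_embd)
print(model.position_embedding_table)     # Embedding(block_size, n_embd)
print(model.lm_head)                      # Linear(n_embd -> vocab_size)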

Member Function Documentation

◆ forward()

bigram.Bigram.BigramLanguageModel.forward ( self,
idx,
targets = None )

runs the forward pass: computes the next-token logits and, when targets are given, the cross-entropy loss

Parameters
self            the current BigramLanguageModel object
idx             the current context batch
targets         the tokens that actually come next
Returns
a (logits, loss) tuple: the next-token logits and the cross-entropy loss (loss is None when no targets are given)

Definition at line 169 of file Bigram.py.

169 def forward(self, idx, targets=None):
170 """
171 @brief runs the forward pass: computes the next-token logits and, when
172 targets are given, the cross-entropy loss
173
174 @param self: the current BigramLanguageModel object
175 @param idx: the current context batch
176 @param targets: the tokens that actually come next
177 @return logits, loss: Tuple(logits, loss), the next-token logits and
178 the cross-entropy loss (None when targets is None)
179 """
180
181 # idx and targets are both (B, T) tensors of integers
182 tok_emb = self.token_embedding_table(idx) # (B, T, C)
183 logits = self.lm_head(tok_emb) # (B, T, vocab_size)
184
185 # negative log likelihood loss (cross-entropy): how close are the logits
186 # to the targets?
187 if targets is None:
188 loss = None
189 else:
190 B, T, C = logits.shape
191 logits = logits.view(B * T, C)
192 targets = targets.view(B * T)
193 loss = F.cross_entropy(logits, targets)
194
195 return logits, loss
196

References lm_head, and token_embedding_table.
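
A minimal sketch of calling forward(), assuming bigram.Bigram is importable (it may load its training text and hyperparameters at import time). The batch is random toy data and all sizes here are illustrative.

import torch
from bigram.Bigram import BigramLanguageModel

vocab_size = 65                                  # illustrative vocabulary size
model = BigramLanguageModel(vocab_size)

B, T = 4, 8                                      # illustrative batch size and context length
idx = torch.randint(0, vocab_size, (B, T))       # current context batch
targets = torch.randint(0, vocab_size, (B, T))   # the tokens that actually come next

logits, loss = model(idx, targets)
print(logits.shape)    # torch.Size([32, 65]): logits are reshaped to (B*T, C) when targets are given
print(loss.item())     # roughly -ln(1/65) ≈ 4.17 for an untrained model

logits, loss = model(idx)                        # no targets
print(logits.shape)    # torch.Size([4, 8, 65])
print(loss)            # None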

◆ generate()

bigram.Bigram.BigramLanguageModel.generate ( self,
idx,
max_new_tokens )

generates max_new_tokens new tokens from the bigram model and appends them to idx

idx is the current context of characters in the current batch, a (B, T) tensor of indices. On each step we run the model on idx and focus on the logits at the last position in the T dimension. These are converted to probabilities using softmax and sampled using multinomial. The sampled indices are then concatenated onto idx along the T dimension, so the context grows by one token per step.

Parameters
self            the current BigramLanguageModel object
idx             the current context batch (which is dim(B by T))
max_new_tokens  the number of new tokens to generate
Returns
idx: modified idx with the new generated tokens appended

Definition at line 197 of file Bigram.py.

197 def generate(self, idx, max_new_tokens):
198 """
199 @brief generates max_new_tokens new tokens from the bigram model and appends them to idx
200
201 idx is the current context of characters in the current batch, a (B, T)
202 tensor of indices. On each step we run the model on idx and focus on
203 the logits at the last position in the T dimension.
204 These are converted to probabilities using softmax and then sampled
205 using multinomial. The sampled indices are concatenated onto idx along
206 the T dimension, so the context grows by one token per step.
207
208 @param self: the current BigramLanguageModel object
209 @param idx: the current context batch (which is dim(B by T))
210 @param max_new_tokens: the number of new tokens to generate
211 @return idx: modified idx with the newly generated tokens appended
212 """
213 # idx is a (B, T) array of indices in the current context
214 for _ in range(max_new_tokens):
215 # get the predictions
216 logits, loss = self(idx)
217 # focus only on the last time step
218 logits = logits[:, -1, :] # now dim(B by C)
219 # apply the softmax to get probabilities
220 probs = F.softmax(logits, dim=-1) # dim(B by C)
221 # sample from the distribution
222 idx_next = torch.multinomial(probs, num_samples=1) # (B, 1)
223 # append sampled index to the running sequence
224 idx = torch.cat((idx, idx_next), dim=1) # now dim(B by T+1)
225 return idx
226
227
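
A short generation sketch, again assuming bigram.Bigram is importable; the model here is freshly constructed and untrained, so the sampled indices are essentially uniform noise. Decoding them back to characters would use the project's own index-to-character mapping.

import torch
from bigram.Bigram import BigramLanguageModel

vocab_size = 65                                   # illustrative vocabulary size
model = BigramLanguageModel(vocab_size)

context = torch.zeros((1, 1), dtype=torch.long)   # start from a single (B=1, T=1) token
out = model.generate(context, max_new_tokens=20)  # dim(1 by 21): one token appended per step
print(out.shape)
print(out[0].tolist())                            # raw indices; decode with the project's itos mapping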

Member Data Documentation

◆ lm_head

bigram.Bigram.BigramLanguageModel.lm_head = nn.Linear(n_embd, vocab_size)

Definition at line 167 of file Bigram.py.

Referenced by forward(), and working_gpt.GPTLanguageModel.forward().

◆ position_embedding_table

bigram.Bigram.BigramLanguageModel.position_embedding_table = nn.Embedding(block_size, n_embd)

Definition at line 166 of file Bigram.py.

Referenced by working_gpt.GPTLanguageModel.forward().

◆ token_embedding_table

bigram.Bigram.BigramLanguageModel.token_embedding_table = nn.Embedding(vocab_size, n_embd)

Definition at line 165 of file Bigram.py.

Referenced by forward(), and working_gpt.GPTLanguageModel.forward().


The documentation for this class was generated from the following file: