Transformer fundamentals
 
tensor_prac.Tensor_Prac Namespace Reference

Functions

 main ()
 Main function that loads data and demonstrates basic tensor operations.
 
 print_block_size (block_size, train_data)
 Prints examples of context-target pairs for given block size.
 
 train_val_split (data)
 Splits data into training and validation sets.
 
 get_batch (batch_size, block_size, split, train_data, val_data)
Creates a training batch of stacked sequences (stacking is better suited to GPUs).
 
 print_training_batch (input, targs, batch_size, block_size)
Prints the contents of a training batch and walks through its context-target pairs.
 

Function Documentation

◆ get_batch()

tensor_prac.Tensor_Prac.get_batch ( batch_size,
block_size,
split,
train_data,
val_data )

Creates a training batch of stacked sequences (stacking is better suited to GPUs).

When training, we can create a batch of block_size-length sequences to train on, which is better for GPUs because they prefer to process many calculations in a single input.

Parameters
batch_size: the size of the batch (how many independent sequences we process in parallel)
block_size: the size of the context (the maximum context length for predictions)
train_data: the Tensor that stores the training data
val_data: the Tensor that stores the validation data
Returns
Tuple(Tensor(batch_inputs), Tensor(batch_targets))

Definition at line 110 of file Tensor_Prac.py.

110def get_batch(batch_size, block_size, split, train_data, val_data):
111 """
112 @brief creates a training batch that can be stacked (better for gpus)
113
114 When training data we can create a batch of block sizes to train on which is better for
115 GPUs as they want to process multiple calculations in one input
116
117 @param batch_size: the size of the batch (how many independent sequences will we process in parallel)
118 @param block_size: the size of the context (what is the maximum context length for predictions)
119 @param train_data: the Tensor that stores the data used to train
120 @param val_data: the Tensor that stores the data used to validate
121 @return Tuple(Tensor(batch_inputs), Tensor(batch_targets))
122
123 """
124 data = train_data if split == "train" else val_data
125 # randint will generate a random location for training
126 ix = torch.randint(len(data) - block_size, (batch_size,))
127 # stack can be used to append a tensor to another
128 x = torch.stack([data[i : i + block_size] for i in ix])
129 y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])
130 return x, y
131
132

Referenced by main().
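The offset-and-shift logic above can be sketched without torch. This plain-Python analogue (the function name `get_batch_lists` and the toy data are illustrative, not part of the source) builds the same context/target windows with lists instead of stacked tensors:

```python
import random

random.seed(1337)  # reproducible sampling, mirroring torch.manual_seed

def get_batch_lists(batch_size, block_size, data):
    # pick random start offsets, leaving room for the one-step target shift
    ix = [random.randrange(len(data) - block_size) for _ in range(batch_size)]
    x = [data[i : i + block_size] for i in ix]          # contexts
    y = [data[i + 1 : i + block_size + 1] for i in ix]  # targets, shifted right by one
    return x, y

data = list(range(100))   # stand-in for an encoded text tensor
xb, yb = get_batch_lists(4, 8, data)
# each row of yb is the matching row of xb advanced one position in the data
```

In the real function, torch.stack turns these per-row slices into a single (batch_size, block_size) tensor so the GPU can process the whole batch at once.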

◆ main()

tensor_prac.Tensor_Prac.main ( )

Main function that loads data and demonstrates basic tensor operations.

Opens a text file, processes the data, creates training/validation splits, and demonstrates block size operations.

Returns
None

Definition at line 17 of file Tensor_Prac.py.

17def main():
18 """
19 @brief Main function that loads data and demonstrates basic tensor operations.
20
21 Opens a text file, processes the data, creates training/validation splits,
22 and demonstrates block size operations.
23
24 @return None
25 """
26
27 # ensure that when data is randomly sampled we can reproduce that result
28 torch.manual_seed(1337)
29 with open(os.path.dirname(__file__) + "/../input.txt", "r", encoding="utf-8") as f:
30 text = f.read()
31
32 print("length of dataset in characters: ", len(text))
33
34 print(text[:1000])
35
36 chars = sorted(list(set(text)))
37 vocab_size = len(chars)
38 print("".join(chars))
39 print(vocab_size)
40
41 # create a mapping from characters to integers
42 stoi = {ch: i for i, ch in enumerate(chars)}
43 itos = {i: ch for i, ch in enumerate(chars)}
44 # encoder: takes a string, outputs a list of integers
45 encode = lambda s: [stoi[c] for c in s]
46 # decoder: take a list of strings, output the string
47 decode = lambda l: "".join([itos[i] for i in l])
48
49 print(encode("test message"))
50 print(decode(encode("test message")))
51
52 # create a tensor representation of the encoding for all the text
53 data = torch.tensor(encode(text), dtype=torch.long)
54 print(data.shape, data.dtype)
55 # print the first 1000 elements of tensor
56 print(data[:1000])
57
58 # get the train/val split data
59 train_data, val_data = train_val_split(data)
60 print_block_size(8, train_data)
61 input, targs = get_batch(4, 8, "train", train_data, val_data)
62 print_training_batch(input, targs, 4, 8)
63
64

References get_batch(), print_block_size(), print_training_batch(), and train_val_split().
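The character-level tokenizer that main() builds (stoi/itos with paired encode/decode lambdas) round-trips any string drawn from the vocabulary. A minimal self-contained sketch with a toy corpus standing in for input.txt:

```python
text = "hello world"                      # toy corpus standing in for input.txt
chars = sorted(set(text))                 # the character vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for i, ch in enumerate(chars)}
encode = lambda s: [stoi[c] for c in s]         # string -> list of integer ids
decode = lambda l: "".join(itos[i] for i in l)  # list of ids -> string

print(decode(encode("hello")))  # round-trips to "hello"
```

The same mapping scales to the full text file; only the vocabulary (and hence the integer ids) changes.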

◆ print_block_size()

tensor_prac.Tensor_Prac.print_block_size ( block_size,
train_data )

Prints examples of context-target pairs for given block size.

Demonstrates how the context window works by showing input-output pairs for different context lengths.

This function prints the context-target pairs for the data. Another way to describe block size is the amount of context the model looks at on each pass. The block_size could be variable, but for this practice model we fix it to 8.

Parameters
block_size: the size of context we want to take in
train_data: the training data we are using
Returns
None

Definition at line 65 of file Tensor_Prac.py.

65def print_block_size(block_size, train_data):
66 """
67 @brief Prints examples of context-target pairs for given block size.
68
69 Demonstrates how the context window works by showing input-output pairs
70 for different context lengths.
71
72 This function prints the context-target pairs for the data.
73 Another way to describe block size is the amount of context
74 that the model looks at on each pass. The block_size could be
75 variable, but for this practice model we will fix it
76 to 8.
77
78 @param block_size: the size of context we want to take in
79 @param train_data: the training data we are using
80 @return None
81 """
82 print(train_data[: block_size + 1])
83
84 # x will be used at training data and compared against y which is the
85 # actual next token
86 x = train_data[:block_size]
87 y = train_data[1 : block_size + 1]
88 for t in range(block_size):
89 context = x[: t + 1]
90 target = y[t]
91 print(f"when input is {context} the target: {target}")
92
93

Referenced by main().
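The windowing that print_block_size demonstrates, i.e. one slice of block_size + 1 tokens yielding block_size context-target pairs, can be checked on a toy sequence (the toy data below is illustrative, not from the source):

```python
train_data = list(range(20))   # toy encoded data
block_size = 8

x = train_data[:block_size]            # contexts
y = train_data[1 : block_size + 1]     # next-token targets
pairs = [(x[: t + 1], y[t]) for t in range(block_size)]
# the single window produces exactly block_size training examples,
# from a 1-token context up to the full block_size context
```

This is why get_batch slices block_size + 1 tokens per row: the extra token supplies the target for the longest context.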

◆ print_training_batch()

tensor_prac.Tensor_Prac.print_training_batch ( input,
targs,
batch_size,
block_size )

Prints the contents of a training batch.

Displays the input and target tensors, then walks through each context-target pair in the batch.

Parameters
input: the input batch tensor
targs: the target batch tensor
batch_size: the batch size of our input tensors
block_size: the block size (context length) of our model
Returns
None

Definition at line 133 of file Tensor_Prac.py.

133def print_training_batch(input, targs, batch_size, block_size):
134 """
135 @brief Prints the contents of a training batch.
136
137 Displays the input and target tensors, then walks through each
138 context-target pair in the batch.
139
140 @param input: the input batch tensor
141 @param targs: the target batch tensor
142 @param batch_size: the batch_size of our input tensors
143 @param block_size: the block size (context length) of our model
144 @return None
145
146 """
147 print()
148 print("inputs:")
149 print(input.shape)
150 print(input)
151 print("targets:")
152 print(targs.shape)
153 print(targs)
154 print("---------")
155 # this is how we would traverse this batch tensor (it's basically a matrix)
156 for b in range(batch_size):
157 for t in range(block_size):
158 context = input[b, : t + 1]
159 target = targs[b, t]
160 print(f"when input is {context.tolist()} the target: {target}")
161
162 # seed for this because we want to make it random but reproducible
163
164

Referenced by main().
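Traversing a batch the way print_training_batch does multiplies the single-window case out: a (batch_size, block_size) pair of tensors unpacks into batch_size * block_size context-target examples. A list-based sketch (toy values, not from the source):

```python
inputs  = [[1, 2, 3, 4], [5, 6, 7, 8]]   # shape (batch_size=2, block_size=4)
targets = [[2, 3, 4, 5], [6, 7, 8, 9]]   # same rows shifted right by one

examples = []
for b in range(2):        # loop over batch rows
    for t in range(4):    # loop over positions within a row
        examples.append((inputs[b][: t + 1], targets[b][t]))
# 2 * 4 = 8 training examples from one batch
```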

◆ train_val_split()

tensor_prac.Tensor_Prac.train_val_split ( data)

Splits data into training and validation sets.

Creates a 90%/10% split between training and validation data.

Parameters
data: Full dataset tensor
Returns
Tuple of (train_data, val_data)

Definition at line 94 of file Tensor_Prac.py.

94def train_val_split(data):
95 """
96 @brief Splits data into training and validation sets.
97
98 Creates a 90%/10% split between training and validation data.
99
100 @param data: Full dataset tensor
101 @return Tuple of (train_data, val_data)
102 """
103
104 n = int(0.9 * len(data))
105 train_data = data[:n]
106 val_data = data[n:]
107 return train_data, val_data
108
109

Referenced by main().
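The 90/10 split is a one-liner on any sequence, since slicing works the same on lists and tensors. A sketch without torch (the frac parameter is an illustrative generalization, not in the source):

```python
def train_val_split(data, frac=0.9):
    # first 90% trains, the held-out tail validates
    n = int(frac * len(data))
    return data[:n], data[n:]

train, val = train_val_split(list(range(100)))
# 90 training items, 10 validation items, nothing lost or duplicated
```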