working_gpt.MultiHeadAttention Class Reference

Multiple heads of self-attention in parallel.
Public Member Functions

  __init__(self, num_heads, head_size)
  forward(self, x)

Public Attributes

  heads = nn.ModuleList([Head(head_size) for _ in range(num_heads)])
  proj = nn.Linear(head_size * num_heads, n_embd)
  dropout = nn.Dropout(dropout)
Detailed Description

Multiple heads of self-attention in parallel. Each Head attends to the input independently; forward() concatenates their outputs and maps the result back to n_embd through the proj linear layer, followed by dropout.
Definition at line 111 of file working_gpt.py.
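This page records where each member is defined but not the method bodies. The following is a minimal sketch consistent with the members and cross-references above, assuming Head is the single-head self-attention module defined earlier in working_gpt.py and that n_embd and dropout are module-level hyperparameters (as the attribute initializers imply):

import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """multiple heads of self-attention in parallel"""

    def __init__(self, num_heads, head_size):
        super().__init__()
        # num_heads independent attention heads, run in parallel
        self.heads = nn.ModuleList([Head(head_size) for _ in range(num_heads)])
        # projection back to the embedding width after concatenation
        self.proj = nn.Linear(head_size * num_heads, n_embd)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # each head maps (B, T, n_embd) -> (B, T, head_size); concatenating
        # along the channel dimension yields (B, T, num_heads * head_size)
        out = torch.cat([h(x) for h in self.heads], dim=-1)
        out = self.dropout(self.proj(out))
        return out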
Member Function Documentation

working_gpt.MultiHeadAttention.__init__(self, num_heads, head_size)
Definition at line 114 of file working_gpt.py.
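For illustration, a hypothetical construction following the usual convention in this file that the concatenated head width equals the embedding width (num_heads * head_size == n_embd); the hyperparameter values below are assumptions, not taken from working_gpt.py:

n_embd = 384                       # illustrative value
num_heads = 6
head_size = n_embd // num_heads    # 64, so 6 * 64 == n_embd
sa = MultiHeadAttention(num_heads, head_size)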
working_gpt.MultiHeadAttention.forward(self, x)
Definition at line 120 of file working_gpt.py.
References dropout, heads, and proj.
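A usage sketch for the forward pass, continuing the hypothetical instance above; B and T are illustrative batch and sequence sizes (T must not exceed the model's block size if Head applies a causal mask of that length):

B, T = 4, 8
x = torch.randn(B, T, n_embd)       # (B, T, n_embd)
out = sa(x)                         # concat of heads -> proj -> dropout
assert out.shape == (B, T, n_embd)  # projection restores the embedding width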
Member Data Documentation

working_gpt.MultiHeadAttention.dropout = nn.Dropout(dropout)
Definition at line 118 of file working_gpt.py.
Referenced by forward().
working_gpt.MultiHeadAttention.heads = nn.ModuleList([Head(head_size) for _ in range(num_heads)])
Definition at line 116 of file working_gpt.py.
Referenced by forward().
working_gpt.MultiHeadAttention.proj = nn.Linear(head_size * num_heads, n_embd)
Definition at line 117 of file working_gpt.py.
Referenced by forward().