Commit dc4da21

headings
1 parent bb84c7a commit dc4da21

17 files changed: +56 −0 lines changed

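Every change in this commit has the same shape: each class or section heading in these files already carries an HTML anchor (e.g. `<a id="residual_block"></a>`) on the line above it, and the commit inserts a blank line between the anchor and the heading. A likely motivation, which is my reading and not stated in the commit message, is that Markdown renderers treat a heading that directly follows an inline HTML block as part of that block, so the blank line is needed for the `## ...` heading to render. A minimal sketch of the resulting docstring layout, taken from the resnet hunk below (base class and class body omitted so the snippet stands alone):

class ResidualBlock:
    """
    <a id="residual_block"></a>

    ## Residual Block

    This implements the residual block described in the paper.
    """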

labml_nn/cfr/__init__.py

Lines changed: 8 additions & 0 deletions
@@ -68,6 +68,7 @@
 i.e. all those histories look the same in the eye of the player.

 <a id="Strategy"></a>
+
 ### Strategy

 **Strategy of player** $i$, $\sigma_i \in \Sigma_i$ is a distribution over actions $A(I_i)$,
@@ -84,6 +85,7 @@
 $\sigma_{-i}$ is strategies of all players except $\sigma_i$

 <a id="HistoryProbability"></a>
+
 ### Probability of History

 $\pi^\sigma(h)$ is the probability of reaching the history $h$ with strategy profile $\sigma$.
@@ -109,6 +111,7 @@
 $$u_i(\sigma) = \sum_{h \in Z} u_i(h) \pi^\sigma(h)$$

 <a id="NashEquilibrium"></a>
+
 ### Nash Equilibrium

 Nash equilibrium is a state where none of the players can increase their expected utility (or payoff)
@@ -204,6 +207,7 @@
 So we need to minimize $R^T_i$ to get close to a Nash equilibrium.

 <a id="CounterfactualRegret"></a>
+
 ### Counterfactual regret

 **Counterfactual value** $\color{pink}{v_i(\sigma, I)}$ is the expected utility for player $i$ if
@@ -235,6 +239,7 @@
 where $$R^{T,+}_{i,imm}(I) = \max(R^T_{i,imm}(I), 0)$$

 <a id="RegretMatching"></a>
+
 ### Regret Matching

 The strategy is calculated using regret matching.
@@ -271,6 +276,7 @@
 therefore reaches $\epsilon$-[Nash equilibrium](#NashEquilibrium).

 <a id="MCCFR"></a>
+
 ### Monte Carlo CFR (MCCFR)

 Computing $\color{coral}{r^t_i(I, a)}$ requires expanding the full game tree
@@ -331,6 +337,7 @@
 class History:
     """
     <a id="History"></a>
+
     ## History

     History $h \in H$ is a sequence of actions including chance events,
@@ -404,6 +411,7 @@ def __repr__(self):
 class InfoSet:
     """
     <a id="InfoSet"></a>
+
     ## Information Set $I_i$
     """

labml_nn/conv_mixer/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -45,6 +45,7 @@
 class ConvMixerLayer(Module):
     """
     <a id="ConvMixerLayer"></a>
+
     ## ConvMixer layer

     This is a single ConvMixer layer. The model will have a series of these.
@@ -100,6 +101,7 @@ def forward(self, x: torch.Tensor):
 class PatchEmbeddings(Module):
     """
     <a id="PatchEmbeddings"></a>
+
     ## Get patch embeddings

     This splits the image into patches of size $p \times p$ and gives an embedding for each patch.
@@ -139,6 +141,7 @@ def forward(self, x: torch.Tensor):
 class ClassificationHead(Module):
     """
     <a id="ClassificationHead"></a>
+
     ## Classification Head

     They do average pooling (taking the mean of all patch embeddings) and a final linear transformation

labml_nn/experiments/mnist.py

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@
 class MNISTConfigs(MNISTDatasetConfigs, TrainValidConfigs):
     """
     <a id="MNISTConfigs"></a>
+
     ## Trainer configurations
     """


labml_nn/experiments/nlp_autoregression.py

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ def forward(self, outputs, targets):
 class NLPAutoRegressionConfigs(TrainValidConfigs):
     """
     <a id="NLPAutoRegressionConfigs"></a>
+
     ## Trainer configurations

     This has the basic configurations for NLP auto-regressive task training.

labml_nn/experiments/nlp_classification.py

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@
 class NLPClassificationConfigs(TrainValidConfigs):
     """
     <a id="NLPClassificationConfigs"></a>
+
     ## Trainer configurations

     This has the basic configurations for NLP classification task training.

labml_nn/gan/stylegan/__init__.py

Lines changed: 15 additions & 0 deletions
@@ -158,6 +158,7 @@
 class MappingNetwork(nn.Module):
     """
     <a id="mapping_network"></a>
+
     ## Mapping Network

     ![Mapping Network](mapping_network.svg)
@@ -196,6 +197,7 @@ def forward(self, z: torch.Tensor):
 class Generator(nn.Module):
     """
     <a id="generator"></a>
+
     ## StyleGAN2 Generator

     ![Generator](style_gan2.svg)
@@ -276,6 +278,7 @@ def forward(self, w: torch.Tensor, input_noise: List[Tuple[Optional[torch.Tensor
 class GeneratorBlock(nn.Module):
     """
     <a id="generator_block"></a>
+
     ### Generator Block

     ![Generator block](generator_block.svg)
@@ -327,6 +330,7 @@ def forward(self, x: torch.Tensor, w: torch.Tensor, noise: Tuple[Optional[torch.
 class StyleBlock(nn.Module):
     """
     <a id="style_block"></a>
+
     ### Style Block

     ![Style block](style_block.svg)
@@ -377,6 +381,7 @@ def forward(self, x: torch.Tensor, w: torch.Tensor, noise: Optional[torch.Tensor
 class ToRGB(nn.Module):
     """
     <a id="to_rgb"></a>
+
     ### To RGB

     ![To RGB](to_rgb.svg)
@@ -489,6 +494,7 @@ def forward(self, x: torch.Tensor, s: torch.Tensor):
 class Discriminator(nn.Module):
     """
     <a id="discriminator"></a>
+
     ## StyleGAN 2 Discriminator

     ![Discriminator](style_gan2_disc.svg)
@@ -557,6 +563,7 @@ def forward(self, x: torch.Tensor):
 class DiscriminatorBlock(nn.Module):
     """
     <a id="discriminator_black"></a>
+
     ### Discriminator Block

     ![Discriminator block](discriminator_block.svg)
@@ -645,6 +652,7 @@ def forward(self, x: torch.Tensor):
 class DownSample(nn.Module):
     """
     <a id="down_sample"></a>
+
     ### Down-sample

     The down-sample operation [smoothens](#smooth) each feature channel and
@@ -668,6 +676,7 @@ def forward(self, x: torch.Tensor):
 class UpSample(nn.Module):
     """
     <a id="up_sample"></a>
+
     ### Up-sample

     The up-sample operation scales the image up by $2 \times$ and [smoothens](#smooth) each feature channel.
@@ -690,6 +699,7 @@ def forward(self, x: torch.Tensor):
 class Smooth(nn.Module):
     """
     <a id="smooth"></a>
+
     ### Smoothing Layer

     This layer blurs each channel
@@ -729,6 +739,7 @@ def forward(self, x: torch.Tensor):
 class EqualizedLinear(nn.Module):
     """
     <a id="equalized_linear"></a>
+
     ## Learning-rate Equalized Linear Layer

     This uses [learning-rate equalized weights](#equalized_weights) for a linear layer.
@@ -755,6 +766,7 @@ def forward(self, x: torch.Tensor):
 class EqualizedConv2d(nn.Module):
     """
     <a id="equalized_conv2d"></a>
+
     ## Learning-rate Equalized 2D Convolution Layer

     This uses [learning-rate equalized weights](#equalized_weights) for a convolution layer.
@@ -784,6 +796,7 @@ def forward(self, x: torch.Tensor):
 class EqualizedWeight(nn.Module):
     """
     <a id="equalized_weight"></a>
+
     ## Learning-rate Equalized Weights Parameter

     This is based on equalized learning rate introduced in the Progressive GAN paper.
@@ -821,6 +834,7 @@ def forward(self):
 class GradientPenalty(nn.Module):
     """
     <a id="gradient_penalty"></a>
+
     ## Gradient Penalty

     This is the $R_1$ regularization penality from the paper
@@ -862,6 +876,7 @@ def forward(self, x: torch.Tensor, d: torch.Tensor):
 class PathLengthPenalty(nn.Module):
     """
     <a id="path_length_penalty"></a>
+
     ## Path Length Penalty

     This regularization encourages a fixed-size step in $w$ to result in a fixed-magnitude

labml_nn/optimizers/adam_warmup_cosine_decay.py

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@
 class AdamWarmupCosineDecay(AMSGrad):
     """
     <a id="EmbeddingsWithPositionalEncoding"></a>
+
     ## Adam Optimizer with Warmup and Cosine Decay

     This class extends from AMSGrad optimizer defined in [`amsgrad.py`](amsgrad.html).

labml_nn/optimizers/configs.py

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@
 class OptimizerConfigs(BaseConfigs):
     """
     <a id="OptimizerConfigs"></a>
+
     ## Optimizer Configurations
     """


labml_nn/resnet/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -91,6 +91,7 @@ def forward(self, x: torch.Tensor):
 class ResidualBlock(Module):
     """
     <a id="residual_block"></a>
+
     ## Residual Block

     This implements the residual block described in the paper.
@@ -157,6 +158,7 @@ def forward(self, x: torch.Tensor):
 class BottleneckResidualBlock(Module):
     """
     <a id="bottleneck_residual_block"></a>
+
     ## Bottleneck Residual Block

     This implements the bottleneck block described in the paper.

labml_nn/transformers/configs.py

Lines changed: 2 additions & 0 deletions
@@ -21,6 +21,7 @@
 class FeedForwardConfigs(BaseConfigs):
     """
     <a id="FFN"></a>
+
     ## FFN Configurations

     Creates a Position-wise FeedForward Network defined in
@@ -143,6 +144,7 @@ def _feed_forward(c: FeedForwardConfigs):
 class TransformerConfigs(BaseConfigs):
     """
     <a id="TransformerConfigs"></a>
+
     ## Transformer Configurations

     This defines configurations for a transformer.
