
Hash + Ladder Agent #3697

Merged
jaseweston merged 25 commits into master from parrotz10 on Jun 9, 2021

Conversation

jaseweston
Contributor

No description provided.

@jaseweston jaseweston requested a review from jxmsML June 9, 2021 13:47
Contributor

@spencerp spencerp left a comment


There's some code that can be consolidated, but I didn't see any red flags. Accepting to unblock and I can do a cleanup/consolidation pass later.

projects/params_vs_compute/hash_ladder/hash_ladder.py (outdated review threads, resolved)
dropout_frac = opt.get('dropout', 0.0)
self.dropout = nn.Dropout(p=dropout_frac) # --dropout

self.n_positions = _default(n_positions, get_n_positions_from_options(opt))
Contributor


Looks like many of these properties are the same as TransformerDecoder, and you already call super().__init__(). Perhaps you only need the layer-building code here.

I can add a build_layers function to TransformerDecoder in a subsequent PR for overriding purposes.
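
A rough sketch of how that hook could be used once it exists (it does not yet; the build_layers name, the HashLadderDecoderLayer class, and the constructor arguments below are all hypothetical):

# Hypothetical sketch only: build_layers() is the hook proposed above and does
# not exist yet in TransformerDecoder; names and signatures are illustrative.
import torch.nn as nn
from parlai.agents.transformer.modules import TransformerDecoder


class HashLadderDecoder(TransformerDecoder):
    def build_layers(self) -> nn.ModuleList:
        # Override only the layer construction; embeddings, dropout,
        # n_positions, etc. would still come from super().__init__().
        return nn.ModuleList(
            HashLadderDecoderLayer(self.opt)  # hypothetical layer class from hash_ladder.py
            for _ in range(self.opt['n_layers'])
        )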

Contributor Author


ok, i'll look at these later after this initial merge, thanks

Comment on lines +324 to +330
self.ffn = HashLayerFFN(
    opt,
    embedding_size,
    ffn_size,
    relu_dropout=relu_dropout,
    activation=activation,
)  # type: ignore
Contributor


Same here; it looks like we're only overloading HashLayerFFN? If so, you can just create your TransformerDecoderLayer like this without overriding:

TransformerDecoderLayer.with_components(feedforward=HashLayerFFN)
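
A rough illustration of that suggestion, assuming the swappable-components pattern in parlai.agents.transformer.modules (the constructor arguments shown are illustrative, not the exact signature):

# Sketch of the reviewer's suggestion: swap only the feedforward component
# instead of subclassing the whole decoder layer. Constructor arguments are
# illustrative, not the exact TransformerDecoderLayer signature.
from parlai.agents.transformer.modules import TransformerDecoderLayer

HashLadderDecoderLayer = TransformerDecoderLayer.with_components(
    feedforward=HashLayerFFN  # HashLayerFFN as defined in this PR's hash_ladder.py
)

# The resulting class is then constructed like the original layer, e.g. inside
# the decoder's layer-building loop:
layer = HashLadderDecoderLayer(opt, relu_dropout=relu_dropout, activation=activation)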

Contributor Author


ok, i'll look at these later after this initial merge, thanks

Contributor Author


(but i think i can't do that because i need to pass through the original inputs in the decoder layer)
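
A minimal sketch of the constraint being described (names and signatures are hypothetical, not the PR's actual code): if the feedforward needs the layer's original inputs for hash routing, the layer's forward has to be overridden to pass them through, which with_components alone does not do.

# Illustrative only: hypothetical names and signatures, not the actual PR code.
class HashLadderDecoderLayer(TransformerDecoderLayer):
    def forward(self, x, *args, original_input=None, **kwargs):
        # ... self-attention and encoder attention as in the parent class ...
        residual = x
        # The hash-based FFN routes tokens by the original inputs, so they
        # must be threaded through here rather than hidden behind the
        # component-swapping interface.
        x = self.ffn(x, original_input)
        return x + residual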

@jaseweston jaseweston merged commit a49421f into master Jun 9, 2021
@jaseweston jaseweston deleted the parrotz10 branch June 9, 2021 14:51