Optional apply logprob computation at call site instead of construction site

This issue tracks a possible workaround to make QE optional at run time, rather than fixed at construction time via `skip-cost=true`.

Trace:

`skip-cost`:

https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/translator/scorers.cpp#L41-L43

`createModelFromOptions`:

https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/models/model_factory.cpp#L370-L376

`Stepwise`:

https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/models/costs.h#L309-L313

`Stepwise` relevant call site:

https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/models/costs.h#L360-L369

Would the following work? Add a bool `skipCost` parameter, defaulting to false, to the arguments of `step` here, skip the cost operation when `skipCost=true`, and set the parameter from beam search (see the call site below).
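A minimal sketch of the idea, not marian code: `State`, `RawModel`, `LogSoftmaxCost` and `StepwiseSketch` are hypothetical stand-ins for `DecoderState`, the wrapped encoder-decoder, `LogSoftmaxStep` and `Stepwise`, and the real `step` takes more arguments than shown. It only illustrates deciding per call whether the cost step runs.

```cpp
#include <algorithm>
#include <cmath>
#include <memory>
#include <vector>

// Hypothetical stand-in for DecoderState: holds one step's raw outputs.
struct State {
  std::vector<float> logits;
};

// Hypothetical stand-in for LogSoftmaxStep: logits -> log-probabilities.
struct LogSoftmaxCost {
  std::shared_ptr<State> apply(const std::shared_ptr<State>& s) const {
    float maxLogit = *std::max_element(s->logits.begin(), s->logits.end());
    float sum = 0.f;
    for (float v : s->logits) sum += std::exp(v - maxLogit);
    float logZ = maxLogit + std::log(sum);
    auto out = std::make_shared<State>();
    for (float v : s->logits) out->logits.push_back(v - logZ);
    return out;
  }
};

// Hypothetical stand-in for the wrapped encoder-decoder; passes state through.
struct RawModel {
  std::shared_ptr<State> step(const std::shared_ptr<State>& s) const { return s; }
};

// Sketch of a Stepwise-like wrapper with the proposed run-time switch.
struct StepwiseSketch {
  RawModel encdec;
  LogSoftmaxCost cost;

  std::shared_ptr<State> step(const std::shared_ptr<State>& s,
                              bool skipCost = false) const {
    auto next = encdec.step(s);
    // skipCost == true returns the raw state; otherwise the cost is applied.
    return skipCost ? next : cost.apply(next);
  }
};
```

One thing to watch if this is applied to the real interface: `step` is virtual there, and C++ resolves default arguments from the static type of the call, so the default would have to be declared consistently on the interface and on every override.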

Call-site:

https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/translator/beam_search.cpp#L421

https://github.com/browsermt/marian-dev/blob/08b1544636fe13eaf1fbacb17c6fb050abfb8d42/src/translator/scorers.h#L129-L138


	// add (log)softmax if requested
	if (use == usage::translation) {
	  if(std::dynamic_pointer_cast<EncoderDecoder>(baseModel)) {
	    if(options->get<bool>("output-sampling", false))
	      return New<Stepwise>(std::dynamic_pointer_cast<EncoderDecoder>(baseModel), New<GumbelSoftmaxStep>());
	    else
	      return New<Stepwise>(std::dynamic_pointer_cast<EncoderDecoder>(baseModel), New<LogSoftmaxStep>());

	// class to wrap an IEncoderDecoder and a ILogProbStep that are executed in sequence,
	// wrapped again in the IEncoderDecoder interface
	// @TODO: seems we are conflating an interface defition with its implementation?
	// @TODO: needs a better name. Stepwise is an adjective. Classes are things=nouns. StepwiseWhat?
	class Stepwise : public IEncoderDecoder {

	virtual Ptr<DecoderState> step(Ptr<ExpressionGraph> graph,
	                               Ptr<DecoderState> state,
	                               const std::vector<IndexType>& hypIndices,   // [beamIndex * activeBatchSize + batchIndex]
	                               const Words& words,                         // [beamIndex * activeBatchSize + batchIndex]
	                               const std::vector<IndexType>& batchIndices, // [batchIndex]
	                               int beamSize) override {
	  auto nextState = encdec_->step(graph, state, hypIndices, words, batchIndices, beamSize);
	  return cost_->apply(nextState);
	}

	virtual Ptr<ScorerState> step(Ptr<ExpressionGraph> graph,
	                              Ptr<ScorerState> state,
	                              const std::vector<IndexType>& hypIndices,
	                              const Words& words,
	                              const std::vector<IndexType>& batchIndices,
	                              int beamSize) override {
	  graph->switchParams(getName());
	  auto wrapperState = std::dynamic_pointer_cast<ScorerWrapperState>(state);
	  auto newState = encdec_->step(graph, wrapperState->getState(), hypIndices, words, batchIndices, beamSize);
	  return New<ScorerWrapperState>(newState);

Optional apply logprob computation at call site instead of construction site #71


	bool skipCost = options->get<bool>("skip-cost");
	auto encdec = models::createModelFromOptions(
	    options, skipCost ? models::usage::raw : models::usage::translation);
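Taken together with the `createModelFromOptions` excerpt, this suggests that `skip-cost` decides once, at construction, whether the model is wrapped with the log-softmax step. The sketch below mirrors that pattern under simplifying assumptions: `Usage`, `Model`, `RawEncDec`, `WithLogSoftmax` and `createModel` are hypothetical names, not marian's API.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical counterpart of models::usage, reduced to the two relevant cases.
enum class Usage { raw, translation };

struct Model {
  virtual ~Model() = default;
  virtual std::string describe() const = 0;
};

// Hypothetical unwrapped encoder-decoder.
struct RawEncDec : Model {
  std::string describe() const override { return "raw"; }
};

// Hypothetical wrapper that would apply log-softmax after each step.
struct WithLogSoftmax : Model {
  std::shared_ptr<Model> inner;
  explicit WithLogSoftmax(std::shared_ptr<Model> m) : inner(std::move(m)) {}
  std::string describe() const override {
    return "logsoftmax(" + inner->describe() + ")";
  }
};

// The wrapping decision is made here, once; callers cannot change it later.
std::shared_ptr<Model> createModel(Usage use) {
  auto base = std::make_shared<RawEncDec>();
  if (use == Usage::translation)
    return std::make_shared<WithLogSoftmax>(base);
  return base;
}
```

With this shape, switching the cost on or off means constructing a different model object, which is the limitation a call-site flag would avoid.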
