
Add tx partitioner #1799

Merged: 7 commits merged into main on Nov 21, 2024
Conversation

aaronbuchwald (Collaborator):
This PR adds transaction partitioning based on the sponsor address into the DSMR package.

This will be used to assign transactions to a specific node for chunk production/verification and for targeted transaction gossip.
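The core idea, a deterministic mapping from a transaction's sponsor address onto a stake-weighted validator, can be sketched as follows. This is a minimal illustration, not the PR's code: the names `weightedNode` and `assign` are hypothetical stand-ins, and it assumes the sponsor address is at least 8 bytes long.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// weightedNode is a hypothetical stand-in for the PR's validator type.
type weightedNode struct {
	name   string
	weight uint64
}

// assign interprets the last 8 bytes of the sponsor address as a uint64,
// reduces it modulo the total weight, and walks the cumulative weight
// ranges until it finds the owning validator.
func assign(sponsor []byte, nodes []weightedNode) string {
	totalWeight := uint64(0)
	for _, n := range nodes {
		totalWeight += n.weight
	}
	target := binary.BigEndian.Uint64(sponsor[len(sponsor)-8:]) % totalWeight
	acc := uint64(0)
	for _, n := range nodes {
		acc += n.weight
		if target < acc {
			return n.name
		}
	}
	return "" // unreachable when totalWeight > 0
}

func main() {
	nodes := []weightedNode{{"a", 10}, {"b", 30}, {"c", 60}}
	sponsor := make([]byte, 20)
	sponsor[19] = 5 // target = 5 % 100 = 5, which falls in "a"'s range [0, 10)
	fmt.Println(assign(sponsor, nodes))
}
```

Because the mapping depends only on the sponsor bytes and the validator set, every node computes the same assignment independently.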

@aaronbuchwald aaronbuchwald marked this pull request as ready for review November 19, 2024 18:17
@aaronbuchwald aaronbuchwald self-assigned this Nov 19, 2024
Comment on lines 41 to 49
```go
weightedValidators := make([]*weightedValidator, 0, len(vdrs))
totalWeight := uint64(0)
for _, vdr := range vdrs {
	weightedValidators = append(weightedValidators, &weightedValidator{
		weight: vdr.Weight,
		nodeID: vdr.NodeID,
	})
	totalWeight += vdr.Weight
}
```
Contributor:
It would be marginally better to implement as:

```go
weightedValidators := make([]*weightedValidator, len(vdrs))
totalWeight := uint64(0)
for i, vdr := range vdrs {
	weightedValidators[i] = &weightedValidator{
		weight: vdr.Weight,
		nodeID: vdr.NodeID,
	}
	totalWeight += vdr.Weight
}
...
```

aaronbuchwald (Author):
vdrs is a map, so it would need to be:

```go
i := 0
for _, vdr := range vdrs {
	weightedValidators[i] = &weightedValidator{
		weight: vdr.Weight,
		nodeID: vdr.NodeID,
	}
	totalWeight += vdr.Weight
	i++
}
```

Going to leave this as is, since I'd prefer to skip the extra variable.

```go
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
	sponsor := tx.GetSponsor()
```
Contributor:
Add:

```go
if pp.totalWeight == 0 || len(pp.validators) == 0 {
	return ids.NodeID{}, false
}
```
```go
	return binary.BigEndian.Uint64(sponsor[len(sponsor)-consts.Uint64Len:]) % totalWeight
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
```
tsachiherman (Contributor), Nov 20, 2024:

The implementation of this function is not very efficient. Instead, PrecalculatedPartition should hold an array called accumulatedTotalWeights, where each element contains the total accumulated weight up to and including that validator. Then AssignTx can perform a binary search (O(log n)) instead of a linear scan (O(n)).
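The suggested layout can be sketched with `sort.Search` from the standard library. This is an illustration of the reviewer's idea, not the PR's final code; `assignByBinarySearch` is a hypothetical name, and the accumulated weights are assumed non-decreasing.

```go
package main

import (
	"fmt"
	"sort"
)

// assignByBinarySearch finds the index of the validator whose cumulative
// weight range contains target. accumulatedTotalWeights[i] holds the sum
// of all weights up to and including validator i, so sort.Search returns
// the smallest i for which target < accumulatedTotalWeights[i].
func assignByBinarySearch(target uint64, accumulatedTotalWeights []uint64) int {
	return sort.Search(len(accumulatedTotalWeights), func(i int) bool {
		return target < accumulatedTotalWeights[i]
	})
}

func main() {
	// Weights 10, 30, 60 accumulate to 10, 40, 100:
	// validator 0 owns [0, 10), validator 1 owns [10, 40), validator 2 owns [40, 100).
	acc := []uint64{10, 40, 100}
	fmt.Println(assignByBinarySearch(5, acc))  // 0
	fmt.Println(assignByBinarySearch(10, acc)) // 1
	fmt.Println(assignByBinarySearch(99, acc)) // 2
}
```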

aaronbuchwald (Author):
Great idea

Comment on lines 39 to 65
```go
func getWeightedVdrsFromState(ctx context.Context, state validators.State, pChainHeight uint64, subnetID ids.ID) ([]*weightedValidator, error) {
	vdrs, err := state.GetValidatorSet(ctx, pChainHeight, subnetID)
	if err != nil {
		return nil, err
	}
	weightedValidators := make([]*weightedValidator, 0, len(vdrs))
	for _, vdr := range vdrs {
		weightedValidators = append(weightedValidators, &weightedValidator{
			weight: vdr.Weight,
			nodeID: vdr.NodeID,
		})
	}
	return weightedValidators, nil
}

func precomputePartition[T Tx](validators []*weightedValidator) *PrecalculatedPartition[T] {
	utils.Sort(validators)
	accumulatedWeight := uint64(0)
	for _, weightedVdr := range validators {
		accumulatedWeight += weightedVdr.weight
		weightedVdr.accumulatedWeight = accumulatedWeight
	}
	return &PrecalculatedPartition[T]{
		validators:  validators,
		totalWeight: accumulatedWeight,
	}
}
```
Contributor:
I feel like we should just merge these two functions, or inline all of this code into CalculatePartition.

Comment on lines 67 to 74
```go
func CalculatePartition[T Tx](ctx context.Context, state validators.State, pChainHeight uint64, subnetID ids.ID) (*PrecalculatedPartition[T], error) {
	weightedValidators, err := getWeightedVdrsFromState(ctx, state, pChainHeight, subnetID)
	if err != nil {
		return nil, err
	}

	return precomputePartition[T](weightedValidators), nil
}
```
Contributor:
Another idea is to not accept the validators.State interface, which is a vm/consensus abstraction. We could just accept a slice []Validator, where Validator is a struct of a node ID + weight, so the signature becomes CalculatePartition(context.Context, []Validator). That makes the unit tests a bit nicer, since we won't have to use the awkward validators.State interface, which requires a lot of boilerplate setup. The tradeoff is that the VM would have to copy into the validator type that dsmr defines.

Another option would be to depend on map[ids.NodeID]*GetValidatorsOutput instead of this interface.

```go
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
	if pp.totalWeight == 0 || len(pp.validators) == 0 {
```
Contributor:
Isn't pp.totalWeight == 0 the same as len(pp.validators) == 0? A validator can only have a positive, non-zero amount of stake, so it seems odd to check both cases.

aaronbuchwald (Author):
In reality, we should never hit either case; this is defensive. I don't feel too strongly and am happy to remove either or both.

aaronbuchwald (Author):
Removed both checks

Comment on lines 91 to 94
```go
// Defensive: this should never happen
if nodeIDIndex >= len(pp.validators) {
	return ids.NodeID{}, false
}
```
Contributor:
Should we just panic and break loudly, as opposed to quietly continuing to operate in a state we shouldn't be in? Leaving this check in may mask a problem in the future if this does end up breaking, since this case shouldn't be possible anyway.

Also, sort.Search only returns len(slice) when nothing is found, so we should be checking for == instead of >=.
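The `sort.Search` contract the reviewer is referring to can be shown directly: when no element satisfies the predicate, it returns exactly `len(slice)`, never more. The helper name here is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// searchAccumulated returns the index of the first accumulated weight that
// exceeds target. If target is at or beyond the last accumulated weight,
// sort.Search returns exactly len(acc), so a not-found check can use ==.
func searchAccumulated(acc []uint64, target uint64) int {
	return sort.Search(len(acc), func(i int) bool { return target < acc[i] })
}

func main() {
	acc := []uint64{10, 40, 100}
	fmt.Println(searchAccumulated(acc, 50))  // found: index 2
	fmt.Println(searchAccumulated(acc, 200)) // not found: returns len(acc), i.e. 3
}
```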

aaronbuchwald (Author):
Removed

```go
	return indexedTxs
}

func (pp *PrecalculatedPartition[T]) AssignTxs(txs []T) (map[ids.NodeID][]T, error) {
```
Contributor:
Why do we need both this and AssignTx? Can't we just call AssignTx in a loop if we wanted this behavior?

aaronbuchwald (Author):
This is faster, but could be premature optimization

```go
	return assignments, nil
}

func (pp *PrecalculatedPartition[T]) FilterTxs(nodeID ids.NodeID, txs []T) ([]T, error) {
```
Contributor:
Can't we get rid of this function as well? Filter could be implemented by calling AssignTx in a loop over txs and keeping only the transactions assigned to nodeID.
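The reviewer's point can be sketched with hypothetical stand-in types (`tx`, `assignTx`, and `filterTxs` are illustrations, not the PR's API): a filter is just the assignment loop plus a comparison.

```go
package main

import "fmt"

// tx is a hypothetical transaction whose target field stands in for the
// node that AssignTx would deterministically compute from the sponsor.
type tx struct {
	id     int
	target string
}

// assignTx stands in for PrecalculatedPartition.AssignTx.
func assignTx(t tx) (string, bool) { return t.target, t.target != "" }

// filterTxs keeps only the transactions assigned to nodeID, implemented
// as a plain loop over assignTx rather than a dedicated method.
func filterTxs(nodeID string, txs []tx) []tx {
	filtered := make([]tx, 0, len(txs))
	for _, t := range txs {
		if assigned, ok := assignTx(t); ok && assigned == nodeID {
			filtered = append(filtered, t)
		}
	}
	return filtered
}

func main() {
	txs := []tx{{1, "a"}, {2, "b"}, {3, "a"}}
	fmt.Println(len(filterTxs("a", txs))) // 2
}
```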

aaronbuchwald (Author):
Removed

```go
	return filteredTxs, nil
}

type Partition[T Tx] struct {
```
Contributor:
nit: Maybe we call this PartitionCache?
Q: I'm also not sure how this is going to be used so I'm wondering if this should live in dsmr or if this should live in the vm.

aaronbuchwald (Author):
This belongs in DSMR imo because transactions need to be partitioned to efficiently build chunks with non-overlapping transactions.

This is sort of part of "fortification," but even without malicious users/validators, we need to load balance which validators include transactions in their chunks to make this efficient.

```go
}

type PrecalculatedPartition[T Tx] struct {
	validators []*weightedValidator
```
Contributor:
nit: pointer seems like overkill for this since it's such a small struct... but I don't have performance data to say why it's right or wrong... so feel free to ignore this comment.

```go
	}
}

func CalculatePartition[T Tx](ctx context.Context, state validators.State, pChainHeight uint64, subnetID ids.ID) (*PrecalculatedPartition[T], error) {
```
Contributor:
nit: Personal preference but I generally dislike calling constructors anything other than New*.

```go
	return binary.BigEndian.Uint64(sponsor[len(sponsor)-consts.Uint64Len:]) % totalWeight
}

func (pp *PrecalculatedPartition[T]) AssignTx(tx T) (ids.NodeID, bool) {
```
Contributor:
nit: pp -> p (I just use the first letter of the type almost religiously)

```go
	accumulatedWeight uint64
}

type PrecalculatedPartition[T Tx] struct {
```
Contributor:
  1. I don't care where we put this struct definition, but I prefer defining a struct adjacent to wherever its corresponding functions live.
  2. Regarding naming, PrecalculatedPartition feels awkward to me because, as the caller, I don't care whether it's pre-calculated. Should we just call this something like Partitioner, since it's the thing that gives you a partition?

aaronbuchwald (Author):
Removing this for now to just define the partition function.

Comment on lines +155 to +160
```go
for i, vdr := range partition.validators {
	if vdr.nodeID == nodeID {
		foundNodeIDIndex = i
		break
	}
}
```
Contributor:
Can't we use sort.Search instead of this?
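A sketch of the suggestion, assuming the validators slice is kept sorted by nodeID (as the PR's utils.Sort call suggests): the linear scan becomes a binary search over the sorted IDs. The `nodeID` type and `findNodeIndex` helper are hypothetical stand-ins.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// nodeID is a small stand-in for ids.NodeID.
type nodeID [4]byte

// findNodeIndex binary-searches a slice of node IDs sorted in ascending
// byte order and reports the index of target, or false if absent.
func findNodeIndex(sorted []nodeID, target nodeID) (int, bool) {
	i := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i][:], target[:]) >= 0
	})
	if i < len(sorted) && sorted[i] == target {
		return i, true
	}
	return 0, false
}

func main() {
	ids := []nodeID{{1}, {2}, {3}}
	i, ok := findNodeIndex(ids, nodeID{2})
	fmt.Println(i, ok) // 1 true
}
```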

aaronbuchwald merged commit 7ab827d into main on Nov 21, 2024; 17 checks passed.
aaronbuchwald deleted the dsmr-partition branch on November 21, 2024 at 17:33.