Make old-timestep eviction less surprising #24

Open
philip-davis opened this issue Jul 13, 2017 · 5 comments

@philip-davis
Owner

Two aspects of old timestep eviction are surprising (to me at least):

  1. Eviction is performed without consideration of whether it is required. This is fine if version always increments by one, but suppose we have max_versions = 5, and the following versions are written:
    0 2 4 5 7

Here we wrote five timesteps, so as a user I would expect all of them to still be stored. But that is not the case: version 5 overwrote version 0, and version 7 overwrote version 2. So only three timesteps remain in DataSpaces after this sequence of puts.

  2. Eviction is based upon overlap of bounding boxes. Suppose max_versions = 1. Consider the following three scenarios:

a. {version:0, lb:{0,0}, ub:{3,3}} followed by {version: 1, lb:{0,0}, ub:{3,3}}. The first put is evicted and replaced. So far, so good.
b. {version:0, lb:{0,0}, ub:{3,3} } followed by {version: 1, lb:{4,4}, ub:{7,7}}. Nothing is evicted, both are in memory. This seems to violate "max_versions = 1".
c. {version:0, lb:{0,0}, ub:{3,3} } followed by {version: 1, lb:{3,3}, ub:{6,6}}. The first put is evicted and replaced. This honors max_versions, but is inconsistent with b.
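
Scenarios a to c can be checked with a small sketch of an inclusive bounding-box overlap test. The function name `boxes_overlap` is hypothetical and this is not the DataSpaces code; it only assumes that `lb` and `ub` are inclusive corners, as the scenarios suggest.

```python
def boxes_overlap(lb_a, ub_a, lb_b, ub_b):
    """Return True if two boxes with inclusive corners share at least one element.

    Hypothetical helper for illustration; not taken from DataSpaces.
    """
    return all(
        lo_a <= hi_b and lo_b <= hi_a
        for lo_a, hi_a, lo_b, hi_b in zip(lb_a, ub_a, lb_b, ub_b)
    )


# Scenario a: identical boxes.
scenario_a = boxes_overlap((0, 0), (3, 3), (0, 0), (3, 3))
# Scenario b: disjoint boxes.
scenario_b = boxes_overlap((0, 0), (3, 3), (4, 4), (7, 7))
# Scenario c: boxes that share only the element (3, 3).
scenario_c = boxes_overlap((0, 0), (3, 3), (3, 3), (6, 6))

print(scenario_a, scenario_b, scenario_c)  # True False True
```

Under this test, a and c trigger eviction and b does not, which matches the behavior described in the three scenarios.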

Fixing 1 seems to require only some additional logic when checking for eviction: we should evict only when max_versions objects that would otherwise be eligible for eviction are actually stored.
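
A minimal sketch of the difference, under two assumptions that are not confirmed anywhere in this thread: that the current behavior amounts to storing each version in slot `version % max_versions` (which would explain 5 replacing 0 and 7 replacing 2), and that the fixed policy evicts the oldest held version. All function names here are hypothetical.

```python
def put_by_slot(slots, version, max_versions):
    """Assumed current behavior: the slot is chosen by version % max_versions,
    so a put replaces whatever is in that slot even if other slots are empty."""
    slots[version % max_versions] = version


def put_evict_when_full(held, version, max_versions):
    """Proposed behavior: evict (here, the oldest version) only when
    max_versions distinct versions are already held."""
    if version not in held and len(held) >= max_versions:
        held.remove(min(held))
    held.add(version)


def run(versions, max_versions):
    """Apply the same sequence of puts under both policies."""
    slots, held = {}, set()
    for v in versions:
        put_by_slot(slots, v, max_versions)
        put_evict_when_full(held, v, max_versions)
    return sorted(slots.values()), sorted(held)


by_slot, when_full = run([0, 2, 4, 5, 7], max_versions=5)
print(by_slot)    # [4, 5, 7]
print(when_full)  # [0, 2, 4, 5, 7]
```

With the second policy, nothing is evicted until a sixth distinct version arrives.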

Fixing 2 gets at a semantic question: what do we mean by max_versions? Do we mean that we will hold max_versions versions of each element of each variable, that we will hold elements for up to max_versions versions in total, or something else?

@philip-davis philip-davis self-assigned this Jul 13, 2017
@philip-davis
Owner Author

This sequence of puts is also concerning:

d. {version:0, lb:{0,0}, ub:{3,3}} followed by {version: 0, lb:{3,3}, ub:{3,3}}. The first put will be evicted, and only the second put remains. It seems to me that the second put should have updated the element at [3][3], and that eviction should not come into play at all here.

@melrom
Collaborator

melrom commented Jul 14, 2017

There are two things being discussed here and they need to be separated.

  1. Logical Eviction
  2. Physical Eviction

For logical eviction, there are two criteria: variable name and bounding box. Max versions does not mean that there is only 1 version of data for the entire data domain. That doesn't make sense because a lot of scientific applications work on separate regions. We don't want to clear out timestep 0 data in lb:{0,0}, ub:{3,3} if the application in timestep 1 is operating on region lb:{4,4}, ub:{7,7}. The two regions have no overlap with one another. The data in region (0,0)-(3,3) is still valid data in the entire data domain.

In c and d, the bounding boxes overlap at (3,3), which triggers the eviction. The reason the entire region is evicted is mostly just because it was easiest to code -- we have the metadata info that there IS overlap -- however, if we wanted to 'evict' only element (3,3), we would have to find out which server has the data object, copy the object back into the form relevant to the application, and replace a single element of it.

In part b, you bring up "both are in memory", and that starts to refer to physical eviction. The servers that are responsible for keeping track of the data don't necessarily have the data in their physical memory. The eviction process happens mostly at the DHT level only (i.e., metadata) unless the physical data object is on the same server. I've always found this somewhat irritating: we keep memory occupied by data objects that the user has already deemed irrelevant and that even DataSpaces no longer knows about, yet the memory itself is never cleared or freed.

@parasharmanish
Collaborator

parasharmanish commented Jul 14, 2017 via email

@philip-davis
Owner Author

philip-davis commented Jul 14, 2017 via email

@philip-davis
Owner Author

Specifically, this sequence:
*1. put:{version:0, lb:{0,0}, ub:{1,1}}
2. put:{version:0, lb:{1,1}, ub:{1,1}}
3. get:{version:0, lb:{0,0}, ub:{0,0}} [error]

will have different results than this sequence:

1. put:{version:0, lb:{0,0}, ub:{0,1}}
*2. put:{version:0, lb:{1,0}, ub:{1,1}}
3. put:{version:0, lb:{1,1}, ub:{1,1}}
4. get:{version:0, lb:{0,0}, ub:{0,0}} [no problem]

This is despite it seeming intuitive that the system should be in the same state after the starred operation in each sequence.
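
The two sequences can be replayed against a toy model of whole-object eviction. This is a sketch under stated assumptions, not the DataSpaces implementation: `ToyStore`, `boxes_overlap`, and `can_get` are hypothetical names, max_versions is taken to be 1, a put drops every stored object whose bounding box overlaps the new one, and a get succeeds only if every requested element is covered by a stored object of that version.

```python
from itertools import product


def boxes_overlap(lb_a, ub_a, lb_b, ub_b):
    """True if two boxes with inclusive corners share at least one element."""
    return all(
        lo_a <= hi_b and lo_b <= hi_a
        for lo_a, hi_a, lo_b, hi_b in zip(lb_a, ub_a, lb_b, ub_b)
    )


class ToyStore:
    """Hypothetical model: metadata only, whole-object eviction on overlap."""

    def __init__(self):
        self.objects = []  # list of (version, lb, ub)

    def put(self, version, lb, ub):
        # Assumption: any overlapping object is dropped entirely.
        self.objects = [
            (v, olb, oub)
            for v, olb, oub in self.objects
            if not boxes_overlap(olb, oub, lb, ub)
        ]
        self.objects.append((version, lb, ub))

    def can_get(self, version, lb, ub):
        # Enumerate every requested element and check that some stored
        # object of the same version covers it.
        cells = product(*(range(lo, hi + 1) for lo, hi in zip(lb, ub)))
        return all(
            any(
                v == version
                and all(lo <= c <= hi for lo, c, hi in zip(olb, cell, oub))
                for v, olb, oub in self.objects
            )
            for cell in cells
        )


first = ToyStore()
first.put(0, (0, 0), (1, 1))   # starred operation
first.put(0, (1, 1), (1, 1))   # evicts the whole (0,0)-(1,1) object
first_ok = first.can_get(0, (0, 0), (0, 0))

second = ToyStore()
second.put(0, (0, 0), (0, 1))
second.put(0, (1, 0), (1, 1))  # starred operation
second.put(0, (1, 1), (1, 1))  # evicts only the (1,0)-(1,1) object
second_ok = second.can_get(0, (0, 0), (0, 0))

print(first_ok, second_ok)  # False True
```

In this model the outcome depends on how the region was decomposed into puts, not on which elements were written.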
