createStore and update do not work with frozen objects #53

Open
markerikson opened this issue Apr 10, 2023 · 14 comments · Fixed by #54

Comments

@markerikson commented Apr 10, 2023

Hi! I'm trying out Signalis after @chriskrycho suggested I take a look at it.

For context, I maintain Redux, and also its associated Reselect memoized selector library.

This last week I was playing around with some "autotracking" approaches for memoization, based on @glimmer/validator and pzuraq's "minimal autotracking" examples:

Looking at Signalis, it seems very similar to the autotracking code I was borrowing.

I just tried it out, but unfortunately I've run into an immediate blocker:

Signalis tries to add a $STOREPROXY symbol to target objects, and this throws an error if the object is frozen:

Object.defineProperty(v, $STOREPROXY, { value: proxy });

That's a problem in our case because Redux Toolkit uses Immer for immutable updates, and Immer auto-freezes all state recursively.
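For example, something like this (just a minimal illustration of the auto-freezing; the values are made up):

import { produce } from 'immer'

const next = produce({ value: 0 }, draft => {
  draft.value = 1
})

console.log(Object.isFrozen(next)) // true: Immer recursively auto-freezes produced state by default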

Here's a relatively minimal repro that shows this happening:

import { createStore, update } from '@signalis/core'

const obj1 = {
  value: 0,
}

const obj2 = {
  value: 1,
}

Object.freeze(obj1)
Object.freeze(obj2)

const signalisStore = createStore({
  state: obj1,
})

console.log('State 1: ', signalisStore.state.value)

update(signalisStore, (draft) => {
  draft.state = obj2
})

console.log(signalisStore.state.value)

/*
TypeError: Cannot define property Symbol(signal-store-proxy), object is not extensible
    at Function.defineProperty (<anonymous>)
    at X (D:\Projects\redux\temp\glimmer-tracked-test\node_modules\@signalis\core\dist\signalis-core.umd.cjs:1:7422)
    at Object.get (D:\Projects\redux\temp\glimmer-tracked-test\node_modules\@signalis\core\dist\signalis-core.umd.cjs:1:6766)
    at todo (d:\Projects\redux\temp\glimmer-tracked-test\main.ts:230:40)
*/

I'm pretty interested in seeing how Signalis performs compared to what I've hacked together (mostly by copy-pasting and fiddling with things). Any chance you could find a way to remove the symbol assignment requirement so it works with frozen values?

@markerikson (Author)

Sweet, thanks! Let me know when you've published a new build and I can try this out again.

@cafreeman (Owner)

@markerikson Hi there! Thanks for opening the issue; this is a good find. I think I've found a fairly straightforward fix.

> Any chance you could find a way to remove the symbol assignment requirement so it works with frozen values?

The symbol assignment is pretty deeply baked into how stores work, so I'd like to avoid going down this path if possible. However, I think this should actually be fine.

The repro you gave (thanks, btw) blows up specifically when you use update on a store that contains a frozen object (in other words, the actual store creation with the frozen object should be fine). When update does its thing, it unwraps the store, applies the update, and re-wraps it, which is where things were going haywire.

To work around this, I think we'll be okay to just have unwrap check to see if the thing it's unwrapping is frozen and then make a shallow copy. This should allow the rest of the store functionality to work as expected, but will have the one small tradeoff that any frozen object that gets unwrapped will no longer be referentially equal to the original object passed to createStore. That said, I don't imagine that will be an issue given that you're doing immutable stuff already, so referential equality wasn't a factor to begin with.
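Roughly, the idea is something like this (the function name here is just illustrative, not the actual Signalis source):

function unwrapFrozen<T extends object>(value: T): T {
  // A frozen object can't have the store proxy symbol defined on it, so shallow-copy it first.
  if (Object.isFrozen(value)) {
    return (Array.isArray(value) ? value.slice() : { ...value }) as T
  }
  return value
}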

I'll cut a new release ASAP, and thanks again for the issue :)

@markerikson commented Apr 11, 2023

Hmm. Interesting.

Does that mean that when we end up reading the final result of whatever derived / selected calculation in userland app code, the returned values will no longer be the same references that were in the original data? That feels like it could be potentially problematic, but I'd have to do more testing to confirm.

For a general point of comparison, the "autotracking" code that I've been playing with over the last few days was partly based on https://github.com/simonihmig/tracked-redux/blob/master/packages/tracked-redux/src/-private/proxy.ts . It works by creating a "storage / cell / tag" container instance for each level of wrapped object nesting, and storing the current value inside that container, rather than adding anything onto the tracked object.
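Very roughly, that approach keeps the proxy (and its tags) in a side table keyed by the target object, so nothing ever has to be defined on the target itself. Something like this illustrative sketch (not the actual tracked-redux source):

const PROXIES = new WeakMap<object, object>()

function getProxy<T extends object>(target: T): T {
  let proxy = PROXIES.get(target) as T | undefined
  if (!proxy) {
    proxy = new Proxy(target, {
      get(t, key, receiver) {
        // A real implementation would consume a per-key "tag"/cell here before returning.
        return Reflect.get(t, key, receiver)
      }
    })
    PROXIES.set(target, proxy)
  }
  return proxy
}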

@cafreeman (Owner)

Just published v0.0.12 containing these changes if you want to give them a shot.

> Does that mean that when we end up reading the final result of whatever derived / selected calculation in userland app code, the returned values will no longer be the same references that were in the original data?

🤔 I believe it does mean that, since we'd need to shallow copy each frozen object/array during the unwrap. I'm curious to see if this ends up being an issue for your use case.

> It works by creating a "storage / cell / tag" container instance for each level of wrapped object nesting, and storing the current value inside that container

This is (mostly) how Signalis stores work as well, but you're right that we do stash the actual proxy on the incoming object (and then remove it during unwrap). I can look into what it would take to avoid that entirely, though.

@markerikson (Author)

Yeah, my gut says that making copies of all the original objects would likely be bad for our intended use case, although I can't prove it off the top of my head.

I'll give v0.0.12 a shot tonight to see if I can at least get the general selector implementation running. If you can think of a way to sidestep the copying and keep the original data as-is, I'd definitely be interested in trying that out!

@cafreeman (Owner)

I'm super curious to hear more about your specific use case. If you find that this doesn't work and can come up with some kind of repro, I'd love to take a look.

@markerikson commented Apr 11, 2023

Sure.

The actual thing that I'm trying to do at the moment is build a new memoization implementation for the Reselect library:

We've got a docs page on the selector pattern overall and Reselect specifically here:

A typical Reselect selector looks like this:

const selectCompletedTodos = createSelector(
  (state: RootState) => state.todos,
  (todos) => todos.filter(t => t.completed)
)

const selectTodosByCategory = createSelector(
  (state: RootState) => state.todos,
  (state: RootState, category: string) => category,
  (todos, category) => todos.filter(t => t.category === category)
)

You pass in one or more "input functions" that extract values from the arguments; these typically just return specific subsets of the data. Typically, a selector is called with the Redux root state object as its first argument, and optionally with additional arguments. Reselect maps over the input functions, passing in all of the arguments, and the extracted values are then passed as the args to the "output function".

Internally, Reselect uses a two-tier memoization approach. It first memoizes on the arguments themselves, so that if you call the selector with the exact same argument references, it's a complete no-op.

Assuming the args changed, it runs the input functions, and then does a second level of memoization on those extracted values.
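As a rough sketch of those two tiers (illustrative only, not the actual Reselect internals):

function shallowEqualArrays(a: unknown[] | null, b: unknown[]) {
  return a !== null && a.length === b.length && a.every((v, i) => v === b[i])
}

function createSelectorSketch<S, R>(
  inputFns: Array<(state: S, ...rest: unknown[]) => unknown>,
  outputFn: (...extracted: unknown[]) => R
) {
  let lastArgs: unknown[] | null = null
  let lastExtracted: unknown[] | null = null
  let lastResult!: R

  return function selector(state: S, ...rest: unknown[]): R {
    const args = [state, ...rest]
    // Tier 1: identical argument references -> complete no-op.
    if (shallowEqualArrays(lastArgs, args)) return lastResult
    lastArgs = args

    // Tier 2: run the input functions, then memoize on the extracted values.
    const extracted = inputFns.map(fn => fn(state, ...rest))
    if (!shallowEqualArrays(lastExtracted, extracted)) {
      lastResult = outputFn(...extracted)
      lastExtracted = extracted
    }
    return lastResult
  }
}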

The default memoization approach primarily relies on shallow equality checks, although the comparison can be customized:

Per my original comment, over the last week I've been playing around with a couple possible alternate memoization approaches. One is an "autotracking" approach, the other is a tiered WeakMap cache. (Both were blatantly copy-pasted from Ember and React as starting points - I'm not smart enough to come up with those ideas myself :) )

I was briefly chatting with Chris Krycho about this and he pointed me to Signalis as a potential alternative to the autotracking code I was using.

I'm on my work machine atm so I don't have the code I was testing on hand, but off the top of my head it very roughly looked like:

let store = createStore({args: []});

const cache = createDerived(() => {
  const result = userFunction.apply(null, store.args);
  return result;
})

function memoized() {
  update(store, (draft) => {
    draft.args = arguments;
  })

  return cache.value;
}

So, most of the time this is going to end up with args being [someReduxRootState], and occasionally something like [someReduxRootState, "someCategory", {someValue: 123}].

@cafreeman (Owner)

Ah, this is really interesting. In all honesty, I don't think an actual Signalis store is the right fit here since it's really intended more as an abstraction over the other primitives that makes dealing with more complex data structures less cumbersome.

I see what you're talking about with how tracked-redux very specifically creates a Redux-friendly store by completely divorcing the Proxy from the value itself, but that ends up being less of a general purpose (/ convenient) solution, which is what Signalis is aiming for here.

That said, I do think there's a way to do what you want using the other primitives, though it may look too similar to what you already have:

import { createDerived, createSignal } from '@signalis/core';

function sum(values: number[]): number {
  return values.reduce((acc, num) => {
    return acc + num;
  });
}

const store = {
  args: createSignal([]),
};

const cache = createDerived(() => {
  const result = sum.apply(null, store.args.value);
  return result;
});

function memoized() {
  if (store.args.value === arguments) {
    return cache.value;
  }

  // In real life, we should do this in a for loop over each element in arguments
  if (store.args.value[0] === arguments[0]) {
    console.log('arguments are the same, no recompute');
  } else {
    store.args.value = arguments;
  }

  return cache.value;
}

const input = Object.freeze([1, 2, 3]);

console.log(memoized(input));
console.log(memoized(input));
console.log(memoized(input));
console.log(memoized([...input, 4]));
console.log(memoized(input));

Which would print out:

6
arguments are the same, no recompute
6
arguments are the same, no recompute
6
10
6

However, looking at your original tweet thread, I saw this tweet https://twitter.com/acemarke/status/1645270206208507906 that mentioned a very specific use case (don't recompute a map over an array of Todo IDs when one of the Todos becomes complete). This is something you can absolutely do with Signalis, but I'm not yet sure if it's something that will work in your specific use case. Here's how I'd do that if I were just a dev building an app with Signalis:

import { createDerived, createEffect, createStore, update } from '@signalis/core';

const base = [
  {
    id: '1',
    complete: false,
  },
  {
    id: '2',
    complete: true,
  },
];

const store = createStore(base);

const ids = createDerived(() => {
  console.log('computing');
  return store.map((todo) => todo.id);
});

createEffect(() => {
  console.log(ids.value);
});

update(store, (draft) => {
  draft[0].complete = true;
});

update(store, (draft) => {
  draft.push({
    id: '3',
    complete: false,
  });
});

update(store, (draft) => {
  draft[2].complete = true;
});

Which will print out:

computing
[ '1', '2' ]
computing
[ '1', '2', '3' ]

In other words, it'll print once for the initial createEffect and then only print again when we actually add a new Todo, but changing the status of complete for any Todo doesn't trigger a recompute. It would also recompute if we happened to change the ID of an existing Todo.

I don't yet have a full solution for you, but I thought the stuff above might at least provide some useful insight. Maybe you'll see something in there you can run with.

I'll keep thinking about it and will let you know if I come up with anything else.

@markerikson (Author)

Yeah, the problem here is that I'm trying to tie together a non-"reactive" Redux store (one that has already generated a new immutably updated state object) with the derivation step :) Kinda trying to get a best-of-both-worlds-ish thing.

In practice, the external usage is going to be selectTodoIds(stateRef1, maybeSomeArgHere), followed later by selectTodoIds(stateRef2, maybeSomeArgHere). The later args are typically consistent over time, although it's not uncommon to pass in new object references. (With the older-style React-Redux connect API, some folks would call selectThing(state, props), and the input functions would grab props.x or props.y. The props object would be a new reference every time, but .x might be stable.) It just so happens that by the time we get down into the internals of the selector, we're dealing with an arguments array-like that now contains [todos, x] or something like that.

So, based on my understanding of signals in general, they normally only hold a single value, and an update swaps the entire value out.
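Something like this, as a minimal illustration with the Signalis primitives (values made up):

import { createEffect, createSignal } from '@signalis/core'

const state = createSignal({ count: 0 })

createEffect(() => {
  console.log(state.value.count)
})

// The whole object gets swapped out, so anything reading state.value re-runs,
// even if most of the nested data is structurally identical.
state.value = { count: 1 }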

In this case, what I'm dealing with as a standard use case is state1 vs state2, which are going to be different top-level object references that might have 99% identical internal nested references. And that state2 object has already been generated by the time it gets passed into the selector.

So, that's why the "recursively wrap fields in a Proxy"-type approach seems necessary, because all the updates are done, and now we have to essentially recursively diff the fields that got changed between the two state references.
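As a very rough illustration of what I mean (one signal "cell" per field, only dirtied when that field's reference actually changed between snapshots; all names here are made up):

import { createSignal } from '@signalis/core'

// One cell per tracked field path (illustrative only).
const cells = new Map<string, { value: unknown }>()

function trackField(path: string, next: unknown): unknown {
  let cell = cells.get(path)
  if (!cell) {
    cell = createSignal(next)
    cells.set(path, cell)
  } else if (cell.value !== next) {
    // Only write when the reference changed, so deriveds that read this cell
    // stay clean across structurally-shared state objects.
    cell.value = next
  }
  return cell.value
}

// e.g. trackField('todos', state2.todos) leaves the cell untouched whenever
// the todos slice was structurally shared between state1 and state2.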

@markerikson (Author)

I very briefly tried bumping to v0.0.12 and ran last night's experiment again.

The good news is that it runs. The bad news is that it seems to recompute significantly more often than the default, and it's significantly slower.

I've got a test scenario set up that creates two sets of six selectors. Each set has identical implementations at the app code level, but uses a different memoization approach inside the selectors. I then generate a large number of Redux states, pass each state to each selector, and count how many times they recomputed and how long each set took to execute start-to-finish.

I may try fiddling with this a bit more over the weekend - maybe using createSignal inside of the proxy wrapping logic I already have from the autotrack example would work out better, or something like that.

My current implementation:

import { createStore, createDerived, update } from '@signalis/core'
import {
  createCacheKeyComparator,
  defaultEqualityCheck
} from './defaultMemoize'

const shallowEqual = createCacheKeyComparator(defaultEqualityCheck)

export function signalisMemoize<F extends (...args: any[]) => any>(func: F) {
  // we reference arguments instead of spreading them for performance reasons

  let lastArgs: IArguments | null = null
  const store = createStore({
    args: [] as unknown[]
  })

  const derived = createDerived(() => {
    //console.log('Executing cache: ', store.args)
    const res = func.apply(null, store.args)
    // console.log('Res: ', res)
    return res
  })

  // console.log('Creating memoized function')
  function memoized() {
    if (!shallowEqual(lastArgs, arguments)) {
      update(store, draft => {
        draft.args = arguments as unknown as unknown[]
      })
      lastArgs = arguments
    } else {
      // console.log('Same args: ', lastArgs, arguments)
    }

    return derived.value
  }

  return memoized as F & { clearCache: () => void }
}

The results:

Total recomputations:
cdCounters1 2001
cdCounters2 2001
cdTodoIds 1002
cdTodoIdsAndNames 1002
cdCompletedTodos 1002
cdCompletedTodos2 1002
caCounters1 2001
caCounters2 3002
caTodoIds 3002
caTodoIdsAndNames 3002
caCompletedTodos 3002
caCompletedTodos2 3002
Total elapsed times:  {
  defaultElapsed: 7.483500003814697,
  autotrackElapsed: 391.97940015792847
}

Here's what the usage code looks like, for reference:

describe('More perf comparisons', () => {
  const csDefault = createSelectorCreator(defaultMemoize)
  const csAutotrack = createSelectorCreator(signalisMemoize)

  interface Todo {
    id: number
    name: string
    completed: boolean
  }

  type TodosState = Todo[]

  const counterSlice = createSlice({
    name: 'counters',
    initialState: {
      deeply: {
        nested: {
          really: {
            deeply: {
              nested: {
                c1: { value: 0 }
              }
            }
          }
        }
      },

      c2: { value: 0 }
    },
    reducers: {
      increment1(state) {
        // state.c1.value++
        state.deeply.nested.really.deeply.nested.c1.value++
      },
      increment2(state) {
        state.c2.value++
      }
    }
  })

  const todosSlice = createSlice({
    name: 'todos',
    initialState: [
      { id: 0, name: 'a', completed: false },
      { id: 1, name: 'b', completed: false },
      { id: 2, name: 'c', completed: false }
    ] as TodosState,
    reducers: {
      toggleCompleted(state, action: PayloadAction<number>) {
        const todo = state.find(todo => todo.id === action.payload)
        if (todo) {
          todo.completed = !todo.completed
        }
      },
      setName(state) {
        state[1].name = 'd'
      }
    }
  })

  const store = configureStore({
    reducer: {
      counter: counterSlice.reducer,
      todos: todosSlice.reducer
    },
    middleware: gDM =>
      gDM({
        serializableCheck: false,
        immutableCheck: false
      })
  })

  type RootState = ReturnType<typeof store.getState>

  const states: RootState[] = []

  for (let i = 0; i < 1000; i++) {
    states.push(store.getState())
    store.dispatch(counterSlice.actions.increment1())
    states.push(store.getState())
    store.dispatch(counterSlice.actions.increment2())
    states.push(store.getState())
    store.dispatch(todosSlice.actions.toggleCompleted(1))
    states.push(store.getState())
    store.dispatch(todosSlice.actions.setName())
    states.push(store.getState())
  }

  it('More detailed perf comparison', () => {
    const cdCounters1 = csDefault(
      (state: RootState) =>
        state.counter.deeply.nested.really.deeply.nested.c1.value,
      (state: RootState) => state.counter.c2.value,
      (c1, c2) => {
        return c1 + c2
      }
    )

    const cdCounters2 = csDefault(
      (state: RootState) => state.counter.deeply.nested.really.deeply.nested.c1,
      (state: RootState) => state.counter.c2,
      (c1, c2) => {
        return c1.value + c2.value
      }
    )

    const cdTodoIds = csDefault(
      (state: RootState) => state.todos,
      todos => {
        return todos.map(todo => todo.id)
      }
    )

    const cdTodoIdsAndNames = csDefault(
      (state: RootState) => state.todos,
      todos => {
        return todos.map(todo => ({ id: todo.id, name: todo.name }))
      }
    )

    const cdCompletedTodos = csDefault(
      (state: RootState) => state.todos,
      todos => {
        const completed = todos.filter(todo => todo.completed)
        return completed.length
      }
    )

    const cdCompletedTodos2 = csDefault(
      (state: RootState) => state.todos,
      todos => {
        const completed = todos.filter(todo => todo.completed)
        return completed.length
      }
    )

    const caCounters1 = csDefault(
      (state: RootState) =>
        state.counter.deeply.nested.really.deeply.nested.c1.value,
      (state: RootState) => state.counter.c2.value,
      (c1, c2) => {
        return c1 + c2
      }
    )

    const caCounters2 = csAutotrack(
      (state: RootState) => state.counter.deeply.nested.really.deeply.nested.c1,
      (state: RootState) => state.counter.c2,
      (c1, c2) => {
        // console.log('inside caCounters2: ', { c1, c2 })
        return c1.value + c2.value
      }
    )

    const caTodoIds = csAutotrack(
      (state: RootState) => state.todos,
      todos => {
        return todos.map(todo => todo.id)
      }
    )

    const caTodoIdsAndNames = csAutotrack(
      (state: RootState) => state.todos,
      todos => {
        return todos.map(todo => ({ id: todo.id, name: todo.name }))
      }
    )

    const caCompletedTodos = csAutotrack(
      (state: RootState) => state.todos,
      todos => {
        const completed = todos.filter(todo => todo.completed)
        return completed.length
      }
    )

    const caCompletedTodos2 = csAutotrack(
      (state: RootState) => state.todos,
      todos => {
        const completed = todos.filter(todo => todo.completed)
        return completed.length
      }
    )

    const defaultStart = performance.now()
    for (const state of states) {
      cdCounters1(state)
      cdCounters2(state)
      // console.log('csCounters2', cdCounters2(state))
      cdTodoIds(state)
      cdTodoIdsAndNames(state)
      cdCompletedTodos(state)
      cdCompletedTodos2(state)
    }
    const defaultEnd = performance.now()

    const autotrackStart = performance.now()
    for (const state of states) {
      caCounters1(state)
      caCounters2(state)
      // console.log('State.counter: ', state.counter)
      // console.log('caCounters2', caCounters2(state))
      caTodoIds(state)
      caTodoIdsAndNames(state)
      caCompletedTodos(state)
      caCompletedTodos2(state)
    }
    const autotrackEnd = performance.now()

    const allSelectors = {
      cdCounters1,
      cdCounters2,
      cdTodoIds,
      cdTodoIdsAndNames,
      cdCompletedTodos,
      cdCompletedTodos2,
      caCounters1,
      caCounters2,
      caTodoIds,
      caTodoIdsAndNames,
      caCompletedTodos,
      caCompletedTodos2
    }

    console.log('\nTotal recomputations:')
    Object.entries(allSelectors).forEach(([name, selector]) => {
      console.log(name, selector.recomputations())
    })

    console.log('Total elapsed times: ', {
      defaultElapsed: defaultEnd - defaultStart,
      autotrackElapsed: autotrackEnd - autotrackStart
    })
  })
})

@cafreeman (Owner)

This is great. I'm not at a computer right now, but looking at your code I'm guessing that the Signalis implementation just isn't memoizing anything (which seems to be confirmed by how slow it is). When you do store.args = arguments I suspect it's forcing that Derived to recompute every time since it's still just blowing away the old array and replacing it.

I'll see if I can come up with something and run it through your benchmark.

@markerikson (Author)

Sure, thank you!

I just posted the current WIP over at reduxjs/reselect#605 if you want to take a look.

@cafreeman commented Apr 12, 2023

@markerikson well, the bad news is I haven't beaten your benchmark, but I have gotten things a lot closer with some minor tweaks. Here's what I have so far:

import { createDerived, createSignal } from '@signalis/core'
import {
  createCacheKeyComparator,
  defaultEqualityCheck
} from './defaultMemoize'

const shallowEqual = createCacheKeyComparator(defaultEqualityCheck)

export function signalisMemoize<F extends (...args: any[]) => any>(func: F) {
  // we reference arguments instead of spreading them for performance reasons

  let lastArgs: IArguments | null = null
  const store = {
    args: createSignal<any[]>([])
  }

  const derived = createDerived(() => {
    const res = func.apply(null, store.args.value)
    return res
  })

  // console.log('Creating memoized function')
  function memoized() {
    if (!shallowEqual(lastArgs, arguments)) {
      let didUpdate = false
      const newArray = store.args.value.slice()

      for (let i = 0; i < arguments.length; i++) {
        if (store.args.value[i] !== arguments[i]) {
          didUpdate = true
          newArray[i] = arguments[i]
        }
      }

      if (didUpdate) {
        store.args.value = newArray
      }

      lastArgs = arguments
    } else {
      // console.log('Same args: ', lastArgs, arguments)
    }
    // const start = performance.now()
    // console.log('Calling memoized: ', arguments)

    // const end = performance.now()
    // console.log('Memoized execution time: ', end - start)
    return derived.value
  }

  return memoized as F & { clearCache: () => void }
}

And here's the output:

Total recomputations:
cdCounters1 2001
cdCounters2 2001
cdTodoIds 1002
cdTodoIdsAndNames 1002
cdCompletedTodos 1002
cdCompletedTodos2 1002
caCounters1 2001
caCounters2 2001
caTodoIds 1002
caTodoIdsAndNames 1002
caCompletedTodos 1002
caCompletedTodos2 1002
Total elapsed times:  {
  defaultElapsed: 9.953250005841255,
  autotrackElapsed: 17.18275000154972
}

I'm sure there's a better way to do this that will bring that time down even more (namely making it so that updating slots in that array isn't completely terrible), but this was just with a little bit of poking around.

I'll keep poking at it this week :)

@markerikson (Author)

Yeah, I think that actually is basically equivalent to how defaultMemoize works out of the box, which is a shallow equality check on the extracted input values (which show up here as arguments in memoized). Definitely let me know what else you figure out!
