Replies: 1 comment
Hey @LinyeLi60, this is a valid concern, thanks for raising it. The reason we don't have a benchmark in the paper is that we haven't found one that is exhaustive on this topic; this is not a common feature in AI memory systems. We do believe there is a need to demonstrate its value in a comprehensive benchmark, and we may release one in the coming weeks or months. The CARA framework's goal is to give agents a coherent vision on topics over time, which we believe is fundamental for AI agents that augment typical human jobs (AI PM, AI Analyst...). If you have any ideas or experience, they would be more than welcome here :)
Dear Authors,
Thank you for your insightful work on Hindsight. I found the idea of explicitly modeling subjective opinion memory particularly interesting. One question I had is about evaluation: while the paper emphasizes the importance of subjective opinion memory, I did not find a concrete metric or benchmark specifically designed to assess its quality or effectiveness.
Could you share your thoughts on how subjective opinion memory should ideally be evaluated? For example, do you see it as requiring new benchmarks, human-in-the-loop evaluation, or extensions of existing memory and preference modeling tasks?