Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix calendar example #72

Merged
merged 3 commits into from
Jul 18, 2024

Conversation

austinmw
Copy link
Contributor

@austinmw austinmw commented Jul 15, 2024

Fixes #66

Some of these changes are definitely bugs preventing intended behavior from working, while others are improvements to the existing code to prevent edge cases I discovered. I separated the changes to examples/phone from the other changes in two separate commits in case you'd only like the former.

phone/components/scene.py

  • Fix _PHONE_CALL_TO_ACTION format variable to name instead of agent_name (this caused phone episodes to always fail before executing any phone actions)
  • Fix _PHONE_ACTION_SPEC to use OutputType.FREE instead of 'FREE' (which has no matching condition in player.act())
  • Update did_conclude chain of thought to be more descriptive (I was seeing it unable to understand that a task was complete and therefore running for the full 20 steps)
  • Update _PHONE_CALL_TO_ACTION so it considers the most recent observation (I was seeing that if the most recent event was "Alice schedules meeting with Bob using smartphone", this prompt sees that as already done, so picks something else to do instead.) I think this wasn't needed previously because another bug was causing phone triggering events to never be observed?

phone/components/apps.py

  • Add _print method to PhoneApp
  • Add _print statement to ToyCalendar.add_meeting
  • Add check for unexpected arguments to PhoneApp.invoke_action (to prevent errors in return getattr(self, action.name)(**args) if a unexpected arg is provided)
  • Update Phone.description to: "has only the following apps available" (since I was getting events like "Bob uses the Camera app to take photographs", which triggered the phone scene, but then scheduled an irrelevant meeting)
  • Add ToyCalendar.check_calendar app action (since previously I was seeing that an event like, "Alice uses the Calendar app to check her schedule for the day and confirm her appointment with Bob." could only route to add_meeting, so it would repeat adding the same meeting). There's probably a better way to check to see if none of the app actions make sense in the context of the event and do nothing instead.

phone/components/triggering.py

  • Make phone triggering yes_no_question clearer in _is_phone_event and _get_player_from_event (I was seeing events like, "Alice sent a message to bob to schedule meeting" not being recognized as a phone action)

phone/calendar.ipynb

  • Add project root to python path to fix examples.phone.components import issue
  • Fix START_TIME to be at SETUP_TIME instead of one hour after (since Alice's plan typically mentions scheduling the meeting at SETUP_TIME [8 AM], but the simulation was starting at 9 AM, so she seemed to think she already did that and no longer needs to)
  • Change major clock step size to be 15 minutes instead of 1 hour (since Alice's plan typically involves taking 15 minutes to schedule the meeting, so if step size is much larger than that, she seems to skip this activity and instead choose plans with longer durations [like eating breakfast])
  • Add missing clock_now to SimIdentity
  • Remove unused victim variable
  • Update utterance_from_user to be more explicit (since Alice was giving false positive confirmation by confusing plans for events)
  • Change ObservationSummary.timeframe_delta_until to 15 minutes to match clock step size

Note: During the interrogation phase, since after 12 episodes, the current observations will not be far enough back in time to recall scheduling the meeting, and the summary of observations may skip this detail, Alice may answer that she doesn't recall whether she has successfully scheduled the meeting. Although not ideal, this seems better in my opinion than her just assuming she scheduled it because it's in her plan, as was previously happening.

Reminder: Need to comment out the pip installs in this notebook to preserve the below changes (the thought_chains modification appeared necessary in my testing to prevent edge cases)

observation.py

  • in pre_observe and observe, strip whitespace from observation to improve "[observation]" formatting

thought_chains.py

  • Changed thought_chains.result_to_who_what_where's open_question from "who the event is about" to "the main person the event is about" (since I was previously seeing outputs like, e.g., "Bob is eating breakfast while Alice is scheduling a meeting", which resulted in triggering a 2nd phone event for Alice, even though Alice had already just scheduled the event and bob is the one currently acting)

@austinmw austinmw changed the title Fixes #66 calendar example Fix calendar example Jul 15, 2024
@austinmw austinmw force-pushed the fix-calendar-example branch from bdaa9b1 to 2842ae3 Compare July 15, 2024 17:50
@austinmw
Copy link
Contributor Author

hey, not quite sure about the two failures due to Name 'get_ipython' is not defined here

@jzleibo
Copy link
Collaborator

jzleibo commented Jul 18, 2024

I suspect that it's just a linter issue blocking this from submitting. I'll try and follow the steps we have on our side to fix the issue now so that this can get merged.

@jzleibo
Copy link
Collaborator

jzleibo commented Jul 18, 2024

actually, there seems to be something different going on with this PR. I think the system sees it as having a deeper error, so it's not letting me do the next step in our process to merge it. This will take a bit more investigation I think.. Bear with us us while we figure it out. Sorry for the delay

@copybara-service copybara-service bot merged commit ae4b7b5 into google-deepmind:main Jul 18, 2024
8 of 12 checks passed
@austinmw
Copy link
Contributor Author

Thanks!

@jzleibo
Copy link
Collaborator

jzleibo commented Jul 19, 2024

This is merged now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calendar.ipynb example issues
2 participants