Replies: 2 comments
You might find it useful to clone the codex repo locally and then ask Codex questions about its own implementation. I realize that sounds a little recursive, but it's a pretty effective way to get detailed answers to questions about a code base.
Thanks Eric. After all the digging and chatting with ChatGPT, here is a summary. I'm pasting it here in case someone is looking for the same information later.

In interactive Codex sessions, conversation history is not stored permanently on OpenAI's servers the way ChatGPT retains chat history. The only persistent session state lives on the client side: Codex writes local rollout files (~/.codex/sessions/...) that contain the full transcript of the session. These files can be resumed, forked, and compacted by the local client to continue a conversation, but this persistence lives entirely on the user's machine, not on the server.

OpenAI does implement a server-side prompt caching mechanism that can reuse intermediate computation for identical prompt prefixes across requests. This cache can reduce cost and latency by avoiding recomputation of the same prefix tokens, but it is not a semantic or durable memory of the conversation. The cache is ephemeral, may be evicted under load or inactivity, and cannot reconstruct prior context if the client does not include it in the prompt. Even with optional extended retention (e.g., a 24-hour cache hint in the API), this caching is strictly a performance optimization, not a guarantee of session memory.

Every Codex request must still send the full prompt (including any prefix and relevant history) each turn. The server uses the prompt_cache_key and the prefix tokens to potentially reuse cached computation, but it does not maintain conversational state that can be resumed with an ID alone. Context is therefore always client-driven: the client reconstructs history and composes the prompt every time. Server caching only helps with computational efficiency when the same prefix is reused frequently and has not yet been evicted.
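To make the "client-driven context" point concrete, here is a minimal sketch of what one turn looks like. The `build_request` helper and the model name are illustrative, not part of any real SDK; the key idea is that the full transcript is resent every turn, and `prompt_cache_key` is only a hint for prefix-cache reuse:

```python
# Sketch: client-driven context with server-side prompt caching.
# The server never resumes a conversation from an ID alone, so the
# client resends the full history every turn; prompt_cache_key only
# lets the server reuse cached computation for an identical prefix.

def build_request(history, user_msg, cache_key):
    """Compose the full prompt payload for one turn (illustrative helper)."""
    messages = history + [{"role": "user", "content": user_msg}]
    return {
        "model": "gpt-5",               # assumed model name
        "input": messages,              # full transcript, every turn
        "prompt_cache_key": cache_key,  # hint for prefix-cache reuse
    }

history = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Explain rollout files."},
    {"role": "assistant", "content": "They live under ~/.codex/sessions/."},
]

req = build_request(history, "Where is the cache stored?", "session-abc123")
# Every request carries all prior turns; only the computation for the
# shared prefix may be served from the server's ephemeral cache.
```

If the cache has been evicted, the request still works; the server simply recomputes the prefix, which costs latency but never loses context, because the context was in the request all along.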
In summary: Codex's servers retain cached computation for a short period (minutes to at most hours, depending on configuration), but they do not retain conversation context in a resumable or durable way. True context persistence across sessions comes only from the local files managed by the client, and clients must always send context explicitly for the model to "remember" anything.
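Since persistence comes only from the local rollout files, a client that wants to continue a session rebuilds the history from them. Here is a rough sketch, assuming the rollout is a JSONL transcript with one event per line; the exact schema is an assumption for illustration, so check the files under ~/.codex/sessions/ on your machine:

```python
# Sketch: rebuilding conversation history from a local rollout file.
# ASSUMPTION: the rollout is JSONL with role/content message events;
# the real schema may differ and include other event types.
import json
import tempfile
from pathlib import Path

def load_history(rollout_path):
    """Read one JSON event per line, keeping only message entries."""
    history = []
    for line in Path(rollout_path).read_text().splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        if event.get("role") in ("system", "user", "assistant"):
            history.append({"role": event["role"],
                            "content": event.get("content", "")})
    return history

# Demo with a synthetic rollout file (real ones live under ~/.codex/sessions/)
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write('{"role": "user", "content": "hello"}\n')
    f.write('{"role": "assistant", "content": "hi"}\n')
    f.write('{"type": "token_count", "total": 42}\n')  # non-message event, skipped
    path = f.name

history = load_history(path)
```

The rebuilt history is then prepended to the next request, which is exactly what "the client reconstructs history and composes prompts every time" means in practice.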
Hi,
I want to understand how server-side context retention works, as I am building an app that uses Codex as a worker in the backend. Since reusing conversation context is token-friendly, I am trying to understand the following:
1. Does the server retain any context between requests? If so, what are the retention rules or expiration, e.g. how long it remains available after the last interaction?
2. Is the "auto-compact" context behavior a client-only mechanism, or does the server also compact or retain context across requests?
3. When making new Codex requests, does the server ever use previous request context without it being sent by the client?
Thanks in advance for your help.