Security: Remediation of RCE via Insecure Pickle Deserialization and Hardening of I/O URIs #404
Open
JoshuaProvoste wants to merge 1 commit into google:main from
Compliance with CONTRIBUTING.md
Consistent with the project's security and quality guidelines, this PR includes:
- … `pyglove/core/utils/json_conversion.py` and `pyglove/core/coding/execution.py`.
- … `__reduce__` payloads and verified their neutralization via the new security gates (tested via `verify_security.py`).
- … `isort` and `pyink` conventions).

Description
This PR addresses four critical security vulnerabilities in PyGlove's serialization, I/O, and IPC systems. The existing implementation relied on `pickle` for `_OpaqueObject` serialization and for inter-process result transfer in `sandbox_call`. Unpickling can run arbitrary code, so both paths allow Remote Code Execution (RCE) whenever the input is untrusted. Furthermore, the I/O layer lacked protection against unauthorized remote URI loading.

This refactoring transitions the framework to a "Secure by Default" architecture by replacing `pickle` with Msgpack (a data-only format) for all automated serialization tasks and by implementing explicit security gates for remote resource access.

Key Security Improvements:
- … `pickle` with `msgpack` in `_OpaqueObject`. Msgpack is a data-only binary format: decoding it yields plain values and does not invoke arbitrary callables, which closes the code-execution path that unpickling opens during data loading.
- … `http://`, `s3://`) are now blocked by default, requiring explicit user opt-in via `allow_remote=True`.
- … `sandbox_call` in `execution.py` to transfer results using `msgpack` and PyGlove's symbolic JSON model, which removes malicious object deserialization in the host process as a sandbox-escape path.
- … `pickle` support is preserved for specialized local use cases, it is now gated behind a mandatory `allow_pickle=True` flag, ensuring users are aware of the risk when loading untrusted data.

Technical Implementation Details
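To make the remote-URI gate described above concrete, here is a minimal sketch of a local-only policy check. It is not the PR's `_check_remote_access`: the function name `check_remote_access`, the set of schemes treated as local, and the choice of `PermissionError` are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumption: only scheme-less paths and file:// URIs count as local.
_LOCAL_SCHEMES = {"", "file"}


def check_remote_access(uri: str, allow_remote: bool = False) -> None:
    """Raise PermissionError for non-local URIs unless the caller opts in.

    Hypothetical helper; the real gate in the PR may differ in name,
    policy, and exception type.
    """
    scheme = urlparse(uri).scheme.lower()
    # A one-letter "scheme" is treated as a Windows drive letter (C:\...).
    if len(scheme) == 1:
        scheme = ""
    if scheme not in _LOCAL_SCHEMES and not allow_remote:
        raise PermissionError(
            f"Remote URI {uri!r} is blocked by default; "
            "pass allow_remote=True to opt in."
        )


if __name__ == "__main__":
    check_remote_access("/tmp/data.json")                      # local path: allowed
    check_remote_access("s3://bucket/key", allow_remote=True)  # explicit opt-in
    try:
        check_remote_access("http://example.com/data.json")
    except PermissionError as e:
        print("blocked:", e)
```

A deny-by-default check like this keeps the opt-in visible at every call site, which is the behavior the `allow_remote=True` flag is described as providing.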
- `pyglove/core/utils/json_conversion.py`:
  - `_OpaqueObject`: Switched to `msgpack.packb`/`msgpack.unpackb`.
  - … `decode` to detect legacy `pickle` data and raise a `RuntimeError` (Security Error) if `allow_pickle=False`.
- `pyglove/core/io/file_system.py`:
  - … `_check_remote_access` to validate URIs against a local-only policy.
  - … `open`, `readfile`, and `writefile` to enforce this policy.
- `pyglove/core/coding/execution.py`:
  - … `sandbox_call` to use `msgpack` for process-to-process data transfer, converting results to JSON-safe structures first.
- `pyglove/core/io/sequence.py`:
  - … `allow_remote` flag through the sequence IO layer to ensure consistent behavior in `pg.open_jsonl`.
- `requirements.txt`:
  - … `msgpack>=1.0.0` as a core dependency.

Verification Performed
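As an illustration of the kind of check this verification covers, the sketch below builds a harmless `__reduce__` payload and shows a gate that refuses it unless `allow_pickle=True`. It is not taken from `verify_security.py`: the names `Payload`, `looks_like_pickle`, and `guarded_decode` are hypothetical, the header test is only a heuristic for pickle protocol 2 and later, and `json.loads` stands in for `msgpack.unpackb` so the example needs only the standard library.

```python
import json
import pickle


class Payload:
    """Stand-in for a malicious object.

    __reduce__ chooses the callable that pickle invokes on load. Here it is
    a harmless eval of arithmetic; an attacker could name any importable
    callable instead.
    """

    def __reduce__(self):
        return (eval, ("6 * 7",))


def looks_like_pickle(data: bytes) -> bool:
    # Heuristic (assumption): protocol 2+ pickles start with the PROTO
    # opcode 0x80 plus a version byte and end with the STOP opcode b".".
    # Protocol 0/1 pickles have no such header and are not detected.
    return (
        len(data) >= 3
        and data[0] == 0x80
        and 2 <= data[1] <= pickle.HIGHEST_PROTOCOL
        and data[-1:] == b"."
    )


def guarded_decode(data: bytes, allow_pickle: bool = False, data_decoder=json.loads):
    """Decode bytes, refusing pickle input unless explicitly allowed."""
    if looks_like_pickle(data):
        if not allow_pickle:
            raise RuntimeError(
                "Security Error: pickle data detected; "
                "pass allow_pickle=True only for trusted input."
            )
        return pickle.loads(data)  # runs whatever __reduce__ specified
    return data_decoder(data)  # data-only path: builds plain values


if __name__ == "__main__":
    blob = pickle.dumps(Payload())
    try:
        guarded_decode(blob)
    except RuntimeError as e:
        print("refused:", e)
    print(guarded_decode(blob, allow_pickle=True))  # 42: the payload executed
    print(guarded_decode(b'{"x": 1}'))              # {'x': 1}
```

The opt-in call returning `42` is the point of the demonstration: loading the bytes executed the expression chosen by the payload, which is why the gate defaults to refusal.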
… `verify_security.py`) that confirms:
- `pickle` payloads in JSON now trigger a descriptive security error instead of executing code.
- … `msgpack` engine.
- `sandbox_call` successfully returns results using the hardened communication channel.
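The idea behind the hardened result channel can be sketched as follows: the child process converts its result to a JSON-safe structure, and the parent decodes it with a parser that can only build plain values. This is not PyGlove's `sandbox_call`. The helper `run_isolated` and the child script are hypothetical, `subprocess` stands in for whatever process mechanism the library uses, and `json` stands in for `msgpack` to keep the sketch standard-library only.

```python
import json
import subprocess
import sys

# Code run in the child interpreter. It reads a JSON request from stdin and
# writes a JSON-safe result (dict of numbers) to stdout.
_CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
values = request["values"]
json.dump({"sum": sum(values), "count": len(values)}, sys.stdout)
"""


def run_isolated(values):
    """Run a computation in a separate process and exchange JSON only."""
    proc = subprocess.run(
        [sys.executable, "-c", _CHILD_SOURCE],
        input=json.dumps({"values": values}),
        capture_output=True,
        text=True,
        check=True,   # raise CalledProcessError if the child fails
        timeout=30,
    )
    # json.loads builds only dicts, lists, strings, numbers, booleans and
    # None, so the child cannot hand the parent an object whose
    # deserialization executes code.
    return json.loads(proc.stdout)


if __name__ == "__main__":
    print(run_isolated([1, 2, 3]))  # {'sum': 6, 'count': 3}
```

The trade-off is that only data-shaped results cross the boundary; anything richer has to be converted to a JSON-safe form first, as the PR describes for `sandbox_call`.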