Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-404: [Python] Fix segfault caused by HdfsClient getting closed before an HdfsFile #230

Closed
wants to merge 3 commits into from

Conversation

wesm
Copy link
Member

@wesm wesm commented Dec 7, 2016

The one downside of this patch is that HdfsFile handles don't get garbage-collected until the cyclic GC runs -- I tried to fix this but couldn't get it working. So bytes don't always get flushed to HDFS until close() is called. The flush issue should be addressed on the C++ side

…hen the last user-accessible client reference goes out of scope

Change-Id: If97731c7cd663d89e418b4cdd713eb8f0add49e7
Change-Id: I4cf14f47203be94a8a95f5c7d9b780734a8cec67
def __dealloc__(self):
self.file_handle.close()
# try to avoid cyclic GC
self.file_handle = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get around the cyclic GC problem by having a weak reference here to the file handle? (Disclaimer: I have only used weak references in C++, not in Python nor Cython, so I have no idea if that would really work.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, let me try

Change-Id: I9f2b03e3f9e8677c89f144ff890edc3900431a40
@wesm
Copy link
Member Author

wesm commented Dec 8, 2016

Cool that worked

Copy link
Member

@xhochy xhochy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM

@asfgit asfgit closed this in e139b8b Dec 9, 2016
@wesm wesm deleted the ARROW-404 branch December 12, 2016 22:24
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 2, 2018
Author: Deepak Majeti <deepak.majeti@hpe.com>

Closes apache#230 from majetideepak/PARQUET-846 and squashes the following commits:

9007986 [Deepak Majeti] PARQUET-846: CpuInfo::Init() is not thread safe
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 4, 2018
Author: Deepak Majeti <deepak.majeti@hpe.com>

Closes apache#230 from majetideepak/PARQUET-846 and squashes the following commits:

9007986 [Deepak Majeti] PARQUET-846: CpuInfo::Init() is not thread safe

Change-Id: I33bef39d6291dfe207bfc248cf7ce571f6308ad3
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 6, 2018
Author: Deepak Majeti <deepak.majeti@hpe.com>

Closes apache#230 from majetideepak/PARQUET-846 and squashes the following commits:

9007986 [Deepak Majeti] PARQUET-846: CpuInfo::Init() is not thread safe

Change-Id: I33bef39d6291dfe207bfc248cf7ce571f6308ad3
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 7, 2018
Author: Deepak Majeti <deepak.majeti@hpe.com>

Closes apache#230 from majetideepak/PARQUET-846 and squashes the following commits:

9007986 [Deepak Majeti] PARQUET-846: CpuInfo::Init() is not thread safe

Change-Id: I33bef39d6291dfe207bfc248cf7ce571f6308ad3
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 8, 2018
Author: Deepak Majeti <deepak.majeti@hpe.com>

Closes apache#230 from majetideepak/PARQUET-846 and squashes the following commits:

9007986 [Deepak Majeti] PARQUET-846: CpuInfo::Init() is not thread safe

Change-Id: I33bef39d6291dfe207bfc248cf7ce571f6308ad3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants