Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

68 failed tests when installing from sdist on my box #2293

Closed
camaradadennis opened this issue Mar 17, 2023 · 9 comments
Closed

68 failed tests when installing from sdist on my box #2293

camaradadennis opened this issue Mar 17, 2023 · 9 comments

Comments

@camaradadennis
Copy link

Describe the bug (mandatory)

Trying to run the test suite after installing pymupdf from source returned 68 failed tests. All of them fail with an AttributeError exception. All with one of the two messages bellow.

"AttributeError: 'SwigPyObject' object has no attribute 'thisown'"
"AttributeError: 'SwigPyObject' object has no attribute 'bodytag'"

At the end, there is also these lines.
"swig/python detected a memory leak of type 'Annot *', no destructor found.
"swig/python detected a memory leak of type 'Outline *', no destructor found.

To Reproduce (mandatory)

Install pymupdf with pip, copy the tests/ directory from the source tarball.

From a virtual environment:
$ pip install --no-binary :all: pymupdf
$ pytest -k 'not test_textbox3' tests/

Your configuration (mandatory)

I'm running Gentoo on a amd64.
GNU C compiler version: 12.2.1
Swig version: 4.1.1

PyMuPDF 1.21.1: Python bindings for the MuPDF 1.21.1 library.
Version date: 2022-12-13 00:00:01
Built for Python 3.10 on linux (64-bit).

With pymupdf installed from the binary distribution, all tests pass on this box.

@julian-smith-artifex-com
Copy link
Collaborator

Interestingly enough i started seeing similar errors on a particular Linux system two days ago, sometimes instead resulting in SEGVs.

It looks like there's a refcount bug in reload_page() which can cause objects to be freed incorrectly, which might be connected. Here's a crude diff that fixes things (including the associated valgrind error) for me with latest PyMuPDF 1.21 branch, and both mupdf-1.21.x and mupdf-master:

diff --git a/fitz/fitz.i b/fitz/fitz.i
index 703c6c1..3ee7dd4 100644
--- a/fitz/fitz.i
+++ b/fitz/fitz.i
@@ -170,6 +169,8 @@ static PyObject *JM_Exc_FileDataError;
 static PyObject *JM_Exc_CurrentException;
 %}
 
 //------------------------------------------------------------------------
 // global context
 //------------------------------------------------------------------------
@@ -4300,6 +4301,19 @@ if basestate:
             return Py_BuildValue("i", xref);
         }
 
+        void internal_keep_annot(void* annot)
+        {
+            pdf_annot* annot2 = annot;
+            pdf_keep_annot(gctx, annot2);
+        }
 
         //------------------------------------------------------------------
         // Initialize document: set outline and metadata properties
@@ -4570,6 +4584,11 @@ if basestate:
                 old_annots = {}  # copy annot references to here
                 pno = page.number  # save the page number
                 for k, v in page._annot_refs.items():  # save the annot dictionary
+                    # We need to call pdf_keep_annot() here, otherwise `v`'s
+                    # refcount can reach zero even if there is an external
+                    # reference.
+                    self.internal_keep_annot(v)
                     old_annots[k] = v
                 page._erase()  # remove the page
                 page = None

This is very much work in progress, but if you are able to test the above patch, it would be good to know whether it makes a difference for you.

@camaradadennis
Copy link
Author

I've applied this patch to the 1.21 branch. Ran python setup.py sdist and then pip installed the tarball in a venv.
Now tests don't fail everytime. But sometimes (more often than not) get aborted in test_polyline().

Running pytest -k 'not test_polyline' tests/ sometimes fails in test_line(). Deselecting these two tests fail in test_stamp(). And so on.

pytest output is attatched.

tests.log

@julian-smith-artifex-com
Copy link
Collaborator

I've just built latest PyMuPDF (1.21 branch) with the reload_page() patch described above, and default MuPDF (hard-coded in setup.py as https://mupdf.com/downloads/archive/mupdf-1.21.1-source.tar.gz) on Linux in a venv, and run tests under valgrind with:

PYTHONMALLOC=malloc valgrind python -m pytest PyMuPDF/tests

All 123 tests pass for me.

Valgrind is clean apart from 5 warnings about "Conditional jump or move depends on uninitialised value(s)" in MuPDF Story code; these are fixed (as of about an hour ago) in latest MuPDF (branch master).

To build and install PyMuPDF in the venv i did python setup.py install; i don't think this is significantly different from going via an sdist.

The only thing i can think of is that somehow an existing PyMuPDF installation is still being used in your venv, but that seems unlikely.

@julian-smith-artifex-com
Copy link
Collaborator

julian-smith-artifex-com commented Mar 20, 2023

Ah, apologies, i've just realised that i have another fix in my tree:

diff --git a/fitz/fitz.i b/fitz/fitz.i
index 3ee7dd4..71e49b2 100644
--- a/fitz/fitz.i
+++ b/fitz/fitz.i
@@ -6744,6 +6744,7 @@ def get_oc_items(self) -> list:
             val.thisown = True
             val.parent = weakref.proxy(self) # owning page object
             val.parent._annot_refs[id(val)] = val
+        annot._erase()
         %}
 
         struct Annot *delete_annot(struct Annot *annot)

Without this, i get the same SEGV in test_polyline().

Hopefully this additional fix will resolve the problem for you. I'll push both fixes to the central 1.21 branch in the next day or two.

@camaradadennis
Copy link
Author

I have two separate trees here. One extracted from a release tarball, PyMuPDF-1.21.1 and another tracking the 1.21 branch from github. This second patch you sent, with the annot._erase() call, is already on both.

I'm making sure that the environments that run the tests are separated, and I also don't have any other installation of pymupdf lingering.

I've realized that the 1.21 branch tests fail like noted above regardless of the changes introduced in the first patch. That is: sometimes they all pass, sometimes the python interpreter crashes. The crashes occur when the program calls _update_appearance() from the extension module. This happens if I build the 1.21 branch with or without those changes.

I've tried to determine whether this behaviour is influenced by generated test files lingering on the tests and resources folders. But even running the tests with those folders cleaned sometimes fail.

@julian-smith-artifex-com
Copy link
Collaborator

Thanks for the full report.

I've just pushed the two fixes to the PyMuPDF-1.21 branch, plus another change to fix a leak of TextPage instances.

Yesterday i started testing with valgrind; pytest passes all tests with no valgrind errors. [There were some memory leaks, but i don't think they are relevant here.]

The command line i used is:

PYTHONMALLOC=malloc valgrind --fullpath-after=  python -m pytest -s -vv PyMuPDF

I've run this test building current PyMuPDF-1.21 with current MuPDF master, and also with setup.py's hard-coded MuPDF-1.21.1 release.

I was experiencing similar problems to you a few days ago, and i'm not sure whether my commits have fixed anything or whether something in my environment has changed. But having a clean valgrind run seems fairly significant here.

The only thing i can suggest is that you try making a completely clean checkout of PyMuPDF-1.21, and a similarly clean venv, MuPDF checkout, etc. But i think you've already done essentially that so that might not be very helpful.

Perhaps you could try running with valgrind, in the hope that it might give more information about what's going wrong on your system?

@julian-smith-artifex-com
Copy link
Collaborator

Oh, one thing that might be useful - build PyMuPDF with PYMUPDF_SETUP_MUPDF_BUILD_TYPE=debug so valgrind will see debug info and show file:line in backtraces.

@camaradadennis
Copy link
Author

Thank you!

I've just pulled the latest 1.21 and built it from a clean tree. Ran the tests a bunch of times and there are no more crashes.

I will note that commit e1b6e5c actually removed the annot._erase() call from Annot.delete_annot(). I was under the impression, from the last diff you sent, that this call should be there, since is marked with the + sign. So maybe thist was the source of the crashes.

Nice work! Thank you

@julian-smith-artifex-com
Copy link
Collaborator

Marvelous, very glad it's working for you now.

Apologies for the reversed diff that i posted, i suspect you're right and that surely caused a lot of the confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants