An experiment to see whether, by reading the texts and images from the Albert offer catalog PDF, tracking their coordinates, and clustering them by proximity, I can scrape the PDF and associate each item's price with its photo.
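A minimal sketch of the proximity clustering idea, assuming the extract step emits plain objects with x, y, width and height bounds in page canvas coordinates (the field names here are illustrative, not the actual output format):

```js
// Assumed item shape: { x, y, width, height, ... } in page canvas coordinates.
function center(box) {
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

function distance(a, b) {
  const ca = center(a);
  const cb = center(b);
  return Math.hypot(ca.x - cb.x, ca.y - cb.y);
}

// Group every text item under its nearest image, one cluster per image.
function clusterByProximity(images, texts) {
  const clusters = images.map((image) => ({ image, texts: [] }));
  for (const text of texts) {
    let best = null;
    let bestDistance = Infinity;
    for (const cluster of clusters) {
      const d = distance(cluster.image, text);
      if (d < bestDistance) {
        bestDistance = d;
        best = cluster;
      }
    }
    if (best) best.texts.push(text);
  }
  return clusters;
}
```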
Run node extract to generate the data directory with the extracted image and text bounds, then python3 -m http.server and open http://localhost:8000 to view the extracted data.
Right now the page canvas size is hard-coded to 480x840, which looks correct.
Are all of those images identified as masks by their objId? If so, remove them.
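If the extract script uses pdf.js and records which operator painted each image, masks could also be dropped by keying off the operator type rather than the objId; the op field below is an assumption about the extracted data, not something the current output necessarily has:

```js
// Assumes pdf.js and that each extracted image record carries the operator id
// that painted it (OPS.paintImageXObject vs OPS.paintImageMaskXObject).
import { OPS } from "pdfjs-dist";

function isMask(image) {
  return (
    image.op === OPS.paintImageMaskXObject ||
    image.op === OPS.paintImageMaskXObjectGroup
  );
}

// Example: images.filter((image) => !isMask(image));
```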
Provide multiple image candidates for each text, so that texts sitting in the middle appear under both neighbouring images, and hopefully some of those images can then be excluded as masks, graphics, etc.
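One hedged way to do that: instead of keeping only the single nearest image for a text, keep every image whose distance is within some slack of the best, so borderline texts land in both neighbouring clusters. The slack value below is a guess, not a tuned number, and the function reuses distance() from the clustering sketch above:

```js
// Keep every image whose distance is within `slack` canvas pixels of the best.
function candidateImagesFor(text, images, slack = 40) {
  const scored = images
    .map((image) => ({ image, d: distance(image, text) }))
    .sort((a, b) => a.d - b.d);
  if (scored.length === 0) return [];
  const best = scored[0].d;
  return scored.filter(({ d }) => d - best <= slack).map(({ image }) => image);
}
```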
Also consider whether simple bounding-box overlap is enough to associate texts with an image in all or most cases; text pattern matching could do the rest of the job afterwards.
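A sketch of the overlap test plus a price pattern, assuming text items carry a str field as pdf.js text content does; the regex is a guess at the price format, not something verified against the actual catalog text:

```js
// Axis-aligned bounding-box overlap test between a text item and an image.
function overlaps(a, b) {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  );
}

// Guess at the price format, e.g. "129,90" or "24.90"; adjust once the real
// text items have been inspected.
const PRICE_RE = /^\d{1,4}([.,]\d{2})?$/;

function pricesForImage(image, texts) {
  return texts
    .filter((text) => overlaps(text, image))
    .filter((text) => PRICE_RE.test(text.str.trim()));
}
```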
Help locate the canvas elements by highlighting them when the user hovers the mouse cursor over their corresponding UI toggles.
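A rough browser-side sketch of that hover behaviour; the element id, toggle class, data attributes, and the redrawPage() helper are all made up for illustration and would need to match whatever the viewer page actually renders:

```js
// Draw a highlight rectangle over the page canvas when a toggle is hovered.
const canvas = document.getElementById("page-canvas"); // assumed element id
const ctx = canvas.getContext("2d");

function highlight(bounds) {
  ctx.save();
  ctx.strokeStyle = "red";
  ctx.lineWidth = 2;
  ctx.strokeRect(bounds.x, bounds.y, bounds.width, bounds.height);
  ctx.restore();
}

document.querySelectorAll(".item-toggle").forEach((toggle) => {
  toggle.addEventListener("mouseenter", () => {
    // Bounds are assumed to be stored on the toggle as data attributes.
    highlight({
      x: Number(toggle.dataset.x),
      y: Number(toggle.dataset.y),
      width: Number(toggle.dataset.width),
      height: Number(toggle.dataset.height),
    });
  });
  toggle.addEventListener("mouseleave", () => {
    redrawPage(); // assumed existing function that repaints the page canvas
  });
});
```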
This should help us isolate only the foreground images, but it will need a threshold setting so that dropping decorative junk doesn't also remove the product shots.
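For that threshold, a simple starting point could be a minimum rendered area relative to the page canvas, so small decorative graphics get dropped while full product shots survive; the 1% cutoff below is an arbitrary starting value to tune against real pages:

```js
// Keep only images covering at least `minFraction` of the page canvas area.
// 480x840 matches the hard-coded canvas size mentioned above.
function foregroundImages(images, pageWidth = 480, pageHeight = 840, minFraction = 0.01) {
  const pageArea = pageWidth * pageHeight;
  return images.filter(
    (image) => (image.width * image.height) / pageArea >= minFraction
  );
}
```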