Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Performer Image by scene cover scraper #1039

Merged
merged 4 commits into from
Sep 27, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions SCRAPERS-LIST.md
Original file line number Diff line number Diff line change
Expand Up @@ -1344,6 +1344,7 @@ dc-onlyfans.yml| An Onlyfans DB scene scraper | A python scraper that scrapes On
jellyfin.yml| A Jellyfin/Emby scraper | A python scraper that uses the Jellyfin/Emby API to look for Scenes, Performers and Movies via URL, Query or Fragments. Needs the URL, API-Key and User from Jellyfin set in jellyfin.py and the URLs in jellyfin.yml adopted to your local Jelly/Emby Instance |
MindGeekAPI.yml| A sceneBy(Name\|Fragment) scraper for MindGeek network| A python scraper that queries directly the MindGeek API. For further **needed** instructions refer to the relevant PRs and have a look in the `MindGeekApi.py` file | [#711](https://github.com/stashapp/CommunityScrapers/pull/711) [#738](https://github.com/stashapp/CommunityScrapers/pull/738) [#411](https://github.com/stashapp/CommunityScrapers/pull/411)
multiscrape.yml| A performer scraper that can utilize multiple stash Performer scrapers| A python scraper that can use multiple existing performer scrapers in order to get performer meta. To configure it edit the `multiscrape.py` file|[#594](https://github.com/stashapp/CommunityScrapers/pull/594)
performer-image-by-scene.yml| A performer image scraper that gets images from scene covers | A python scraper that searches for scenes with the performer and sets the scene cover image as the performer image|[#1039](https://github.com/stashapp/CommunityScrapers/pull/1039)
performer-image-dir.yml| A performer image scraper compatible with the actress-pics repo | A python scraper that searches in a cloned actress-pics repo for performer images. Configuration and more info in `performer-image-dir.py`|[#453](https://github.com/stashapp/CommunityScrapers/pull/453)
ScrapeWithURL.yml| A sceneByFragment scraper to perform a sceneByURL scape on scenes with URLs provided | This scraper allows users to perform sceneByURL scrapes in bulk.| [#900](https://github.com/stashapp/CommunityScrapers/issues/900)
ShokoAPI.yml| A sceneByFragment scraper for [Shoko Server](https://shokoanime.com) | A sceneByFragment scraper that queries a local Shoko Server instance using the filename for scene meta. To configure it edit the `ShokoAPI.py` file| [#586](https://github.com/stashapp/CommunityScrapers/issues/586) [#628](https://github.com/stashapp/CommunityScrapers/pull/628)
Expand Down
113 changes: 113 additions & 0 deletions scrapers/performer-image-by-scene.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
import json
import re
import sys
from pathlib import Path

try:
from py_common import log
from py_common import graphql
except ModuleNotFoundError:
print(
"You need to download the folder 'py_common' from the community repo (CommunityScrapers/tree/master/scrapers/py_common)",
file=sys.stderr
)
sys.exit()

MAX_TITLE_LENGTH = 25


def announce_result_to_stash(result):
if result is None:
result = [] if 'query' in sys.argv else {}
if 'query' in sys.argv:
if isinstance(result, list):
print(json.dumps(result))
sys.exit(0)
else:
print(json.dumps([result]))
sys.exit(0)
else:
if isinstance(result, list):
if len(result) > 0:
print(json.dumps(result[0]))
sys.exit(0)
else:
print("{}")
sys.exit(0)
else:
print(json.dumps(result))
sys.exit(0)


# Allows us to simply debug the script via CLI args
if len(sys.argv) > 2 and '-d' in sys.argv:
stdin = sys.argv[sys.argv.index('-d') + 1]
else:
stdin = sys.stdin.read()

frag = json.loads(stdin)
performer_name = frag.get("name")
if performer_name is None:
announce_result_to_stash(None)
else:
performer_name = str(performer_name)

regex_obj_parse_name_with_scene = re.compile(
r"(.*?) - Scene (\d+)\. (.*)", re.IGNORECASE | re.MULTILINE)

parsed_name = regex_obj_parse_name_with_scene.search(performer_name)


if parsed_name:
# scene id already available, get scene directly
performer_name = parsed_name.group(1)
scene_id = parsed_name.group(2)
log.debug(f"Using scene {scene_id} to get performer image")
performer_scene = graphql.getSceneScreenshot(scene_id)
performer = {'Name': performer_name,
'Image': performer_scene['paths']['screenshot'],
'Images': [performer_scene['paths']['screenshot']]}
announce_result_to_stash(performer)
else:
# search for scenes with the performer

# first find the id of the performer
performers_data = graphql.getPerformersIdByName(performer_name)
performer_data = None
if performers_data is None or performers_data['count'] < 1:
announce_result_to_stash(None)
elif performers_data['count'] > 1:
for performers_data_element in performers_data['performers']:
if str(performers_data_element['name']).lower().strip() == performer_name.lower().strip():
performer_data = performers_data_element
break
if performer_data is None:
# No match found by looking into the names, let's loop again and match with the aliases
for performers_data_element in performers_data['performers']:
if performer_name.lower().strip() in str(performers_data_element['aliases']).lower().strip():
performer_data = performers_data_element
break
else:
performer_data = performers_data['performers'][0]

if performer_data is None or 'id' not in performer_data or int(performer_data['id']) < 0:
announce_result_to_stash(None)

# get all scenes with the performer
performer_scenes = graphql.getSceneIdByPerformerId(performer_data['id'])

image_candidates = []
for scene in performer_scenes['scenes']:
if 'paths' in scene and 'screenshot' in scene['paths'] and len(scene['paths']['screenshot']) > 0:
if 'query' in sys.argv:
scene_title = scene.get("title")
if scene_title is None:
scene_title = Path(scene["path"]).name
image_candidates.append(
{
'Name': f'{performer_name} - Scene {scene["id"]}. {scene_title[0:MAX_TITLE_LENGTH]}',
'Image': scene['paths']['screenshot'],
'Images': [scene['paths']['screenshot']]
}
)
announce_result_to_stash(image_candidates)
17 changes: 17 additions & 0 deletions scrapers/performer-image-by-scene.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
name: Performer Image by scene cover

performerByFragment:
action: script
script:
- python
- performer-image-by-scene.py
- fetch

performerByName:
action: script
script:
- python
- performer-image-by-scene.py
- query

# Last Updated June 25, 2022
Loading