Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IFilesMetadata #40761

Merged
merged 3 commits into from
Nov 7, 2023
Merged

IFilesMetadata #40761

merged 3 commits into from
Nov 7, 2023

Conversation

ArtificialOwl
Copy link
Member

@ArtificialOwl ArtificialOwl commented Oct 4, 2023

#40676

Introduction

This is a feature that allow any app to store files metadata.

When a file is created or modified, an event is broadcasted so that a listener can:

  • request for an identical event, but on the background job process
  • set metadata, key and value
  • define metadata index
  • help building queries to search on an indexed metadata
  • help building queries that returns metadata linked to fileid

tasks

  • model Metadata
    • store type of metadata (string, int, bool, array, float)
    • serialize/deserialize and import from database
  • store metadata,
    • update database only if new data (or truly edited) is available
    • avoid parallel update in database
    • delete metadata based on key
    • delete metadata based on prefix
  • index metadata (file id, metadata key, metadata value)
    • generate entries in metadata_index based on value type
  • queries
    • helper to left join metadata based on file_id
    • helper to search on metadata key
    • helper to search on metadata value
    • helper to get metadata key and metadata value table field
    • helper to convert row's data to metadata object
  • events
    • listener on new file
    • listener on edited file
    • on request, dispatch event on background job process
    • occ metadata:get

How to add metadata from your own app:

lib/AppInfo/Application.php


		$context->registerEventListener(MetadataLiveEvent::class, UpdateFilesMetadata::class);
		$context->registerEventListener(MetadataBackgroundEvent::class, UpdateFilesMetadata::class);

UpdateFilesMetadata.php

use OCP\EventDispatcher\Event;
use OCP\EventDispatcher\IEventListener;
use OCP\FilesMetadata\Event\MetadataBackgroundEvent;
use OCP\FilesMetadata\Event\MetadataLiveEvent;

class UpdateFilesMetadata implements IEventListener {
	public function __construct() {}

	public function handle(Event $event): void {
		if (!($event instanceof MetadataLiveEvent) &&
			!($event instanceof MetadataBackgroundEvent)) {
			return;
		}

		$node = $event->getNode();
		// only few files are interesting.
		if ($node->getMimetype() !== 'image/jpeg') {
			return;
		}

		// we only work on process initiated as background job, so ask for a re-run later
		if ($event instanceof MetadataLiveEvent) {
			$event->requestBackgroundJob();
			return;
		}

		$fd = $node->fopen('rb');
		if (false === $fd) {
			return;
		}

		try {
			$oldBufferSize = stream_set_chunk_size($fd, 1);
			$data = @exif_read_data($fd, 'ANY_TAG', true);
			stream_set_chunk_size($fd, $oldBufferSize);

                        // we have exif data in $data, let's index the photo taken date 
			$metadata = $event->getMetadata();
			$metadata->setInt('photo-taken', strtotime($data['EXIF']['DateTimeOriginal']), true);
		} catch (\Exception $e) {
		}
	}
}

How to use the (current) QueryHelper

		/** @var ConnectionAdapter $conn */
		$conn = \OCP\Server::get(ConnectionAdapter::class);
		$qb = $conn->getQueryBuilder();

		$qb->select('file.fileid', 'file.path')->from('filecache', 'file');

                // linking metadataQuery to existing query builder
		$metadataQuery = $this->filesMetadataManager->getMetadataQuery($qb, 'file', 'fileid');
                // add metadata to the request's results
		$metadataQuery->retrieveMetadata();
                // only returns result havin
		$metadataQuery->joinIndex('another');

		$expr = $qb->expr();
		$qb->where(
			$expr->lte(
				$metadataQuery->getMetadataValueIntField(),
				$qb->createNamedParameter(1, IQueryBuilder::PARAM_INT)
			)
		);
		$qb->orderBy($metadataQuery->getMetadataValueIntField(), 'desc');

@ArtificialOwl ArtificialOwl force-pushed the enh/noid/files-metadata branch 2 times, most recently from 376c918 to 40a04a6 Compare October 16, 2023 23:17
@skjnldsv skjnldsv added the 2. developing Work in progress label Oct 17, 2023
@ChristophWurst ChristophWurst added the pending documentation This pull request needs an associated documentation update label Oct 17, 2023
@ArtificialOwl ArtificialOwl changed the title [wip] [poc] FilesMetadata IFilesMetadata Oct 26, 2023
@ArtificialOwl ArtificialOwl marked this pull request as ready for review October 26, 2023 13:29
@ArtificialOwl ArtificialOwl self-assigned this Oct 26, 2023
@ArtificialOwl ArtificialOwl added this to the Nextcloud 28 milestone Oct 26, 2023
}

if ($input->getOption('as-array')) {
$output->writeln(json_encode($metadata->asArray(), JSON_PRETTY_PRINT));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JSON_THROW_ON_ERROR could be added for all JSON function calls.

@ArtificialOwl ArtificialOwl force-pushed the enh/noid/files-metadata branch 4 times, most recently from 1065e8a to 2fc9ba3 Compare November 6, 2023 15:48
Copy link
Member

@icewind1991 icewind1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that the metadata is only loaded on search requests atm?

I don't see any logic to trigger the loading for propfinds

Comment on lines 59 to 75

/**
* extra means data are not related to the main files table
*
* @return string
* @since 28.0.0
*/
public function getExtra(): string;

/**
* returns if data are 'extra' or not
*
* @return bool
* @since 28.0.0
*/
public function isExtra(): bool;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be part of the public api?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might not be necessary

@ArtificialOwl
Copy link
Member Author

ArtificialOwl commented Nov 6, 2023

Do I understand correctly that the metadata is only loaded on search requests atm?

GET request might wants to get their metadata, we could also imagine to use this new API to manage tags
and metadata will also be updated from iOS client (another PR is on its way)

I don't see any logic to trigger the loading for propfinds

If we don't load known indexed metadata, exceptions are thrown by https://github.com/nextcloud/3rdparty/blob/e6c45c6c0d4f92ffec621446fcc3b34584772b13/icewind/searchdav/src/DAV/SearchHandler.php#L102-L128

We could edit the code there to accept 'nc:metadata-*', but not sure this is something we want ?

@blizzz blizzz mentioned this pull request Nov 6, 2023
Signed-off-by: Maxence Lange <maxence@artificial-owl.com>
Copy link
Contributor

@come-nc come-nc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3000 lines /o\

Signed-off-by: Maxence Lange <maxence@artificial-owl.com>
Signed-off-by: Maxence Lange <maxence@artificial-owl.com>
@ArtificialOwl ArtificialOwl merged commit d9c24f6 into master Nov 7, 2023
50 checks passed
@ArtificialOwl ArtificialOwl deleted the enh/noid/files-metadata branch November 7, 2023 16:55
@nickvergessen
Copy link
Member

This PR has no version bump and therefore breaks all master setups because while the table is used, the migration to create it did not trigger.

@nickvergessen
Copy link
Member

Follow up fix in #41351

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2. developing Work in progress enhancement pending documentation This pull request needs an associated documentation update ❄️ 2023-Winter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants