Skip to content

Conversation

@miguelpeixe
Copy link
Member

@miguelpeixe miguelpeixe commented Nov 28, 2024

1208510338230630-as-1208621948575042/f

Introduces the main class for the Network's content distribution with basic functionality for a post to be distributed across the network according to a specified configuration.

It uses Newspack Network implementation on top of data events for that, so it creates 2 new data events:

network_post_updated

Fires once a distributed post updates, with the following payload:

{
	"site_url": "https://node1.com",
	"post_id": 1,
	"config": {
		"enabled": true,
		"site_urls": [ "https://hub.com", "https://node2.com" ],
		"network_post_id": "1234567890abcdef1234567890abcdef"
	},
	"post_data": {
		"title": "Hello World",
		"date": "2021-01-01 00:00:00",
		"slug": "hello-world",
		"post_type": "post",
		"raw_content": "Content",
		"content": "<p>Content</p>",
		"excerpt": "Excerpt"
	}
}

$post_payload['config'] contains the required information for distribution:

  • enabled: Whether distribution is enabled
  • site_urls: Sites to distribute the post to
  • network_post_id: An md5 composed of site URL and post ID to be unique across the network

The network post ID is used to match an existing linked post to perform updates:

/**
* Get a linked post given the distributed post hash.
*
* @param string $post_type The post type.
* @param string $network_post_id The network post ID for the distributed post.
*
* @return WP_Post|null The linked post or null if not found.
*/
protected static function get_linked_post( $post_type, $network_post_id ) {
$posts = get_posts(
[
'post_type' => $post_type,
'post_status' => [ 'publish', 'pending', 'draft', 'auto-draft', 'future', 'private', 'inherit', 'trash' ],
'posts_per_page' => 1,
'meta_query' => [ // phpcs:ignore WordPress.DB.SlowDBQuery.slow_db_query_meta_query
[
'key' => self::NETWORK_POST_ID_META,
'value' => $network_post_id,
],
],
]
);
if ( empty( $posts ) ) {
return null;
}
return $posts[0];
}

network_linked_post_inserted

Fires once a linked post is updated, with the following payload:

{
	"origin": {
		"site_url": "https://node1.com",
		"post_id": 1
	},
	"destination": {
		"site_url": "https://node2.com",
		"post_id": 123,
		"is_unlinked": false
	}
}

This event is not yet used but should allow the origin site (or hub) to be aware of its linked posts across the network. This can be for UI or system purposes.

Testing

  1. Make sure you have a working Network setup with at least 1 hub and 2 nodes (let's call them hub, node1, node2)
  2. On node1, create a new post with various blocks (paragraph, image, quote, button, HPP...) and grab the post ID
  3. Open wp shell and configure the post distribution to hub and node2 by running with the following:
$post_id = 0; // Change to your post ID.
$site_urls = [ 'http://hub.local', 'http://node2.local' ]; // Change to match your setup urls.
Newspack_Network\Content_Distribution::set_post_distribution( $post_id, $site_urls );
  1. The data event is attached to wp_after_insert_post, so you can either hit "Save" on the post or trigger it manually via shell with:
Newspack_Network\Content_Distribution::distribute_post( $post_id );
  1. Once all the events are dealt with, you should see the post with the correct title, date, and content in both hub and node2. Author and taxonomies are not covered in this PR (see 1208510338230630-as-1208621948575089/f and 1208510338230630-as-1208621948575091/f)
  2. Let's unlink the post in node2 by running in node2's shell:
$linked_post_id = 0; // Change to node2's linked post ID
Newspack_Network\Content_Distribution::set_post_unlinked( $linked_post_id );
  1. In node1, make changes to the post and save
  2. Once the events are done, confirm the post has been updated in the hub but remained unchanged in node2
  3. Make changes to the post in node2 and save
  4. Relink the post in node2:
$linked_post_id = 0; // Change to node2's linked post ID
Newspack_Network\Content_Distribution::set_post_unlinked( $linked_post_id, false );
  1. Confirm the post in node2 now reflects the most recent content from the distributed post

@miguelpeixe miguelpeixe self-assigned this Dec 2, 2024
@miguelpeixe miguelpeixe marked this pull request as ready for review December 2, 2024 18:21
@miguelpeixe miguelpeixe requested a review from a team as a code owner December 2, 2024 18:21
@miguelpeixe miguelpeixe requested a review from naxoc December 2, 2024 18:21
*/
public static function init() {
add_action( 'init', [ __CLASS__, 'register_listeners' ] );
add_filter( 'newspack_webhooks_request_priority', [ __CLASS__, 'webhooks_request_priority' ], 10, 2 );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires Automattic/newspack-plugin#3587, which is currently up for review.

update_post_meta( $post_id, self::POST_UNLINKED_META, (bool) $unlinked );

// If the post is being re-linked, update content.
if ( ! $unlinked ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double negative variables are usually pretty confusing to read. "If Unlinked = false it means that yes, it's linked".. Maybe consider having a LINKED state, with straight forward boolean value that will not be confusing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explored alternatives and I'm not sure it improves readability nor maitainability. Ultimately, the post is always linked and the insert_linked_post() method is still called when "unlinked".

Having unlinked as the post meta also makes it easier to manage the different state, as we'll replace the content once the flag is removed. Having it the other way around adds some complexity that I don't think justifies the change.

Perhaps we can find a different name for unlinked that's not a negative so we can preserve the logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can find a different name for unlinked that's not a negative so we can preserve the logic?

Yeah, that would be nice, but I can't think of anything good.

Not the end of the world, if you've explored alternatives and think this is more readable let's go with it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about standalone?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... meh

What can help as well is if we had helper methods like set_post_as_linked ... is_post_linked, that would be shortcuts for the ...unlinked( ..., false) methods

if ( ! self::is_post_distributed( $post ) ) {
return new WP_Error( 'post_not_distributed', __( 'The post is not distributed across the network.', 'newspack-network' ) );
}
Data_Events::dispatch( 'network_post_updated', self::get_post_payload( $post ) );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe get the payload from handle_post_update so you have one single source of truth about the data payload. (If anything changes in that method, you might end up with discrepancies between the two ways this event is dispatched

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Updated in b0c0a0d. In this case, we check on $data because the dispatch call does not bail on empty data, like data event listeners.

/**
* Post meta key for the linked post containing the distributed post hash.
*/
const POST_HASH_META = 'newspack_network_post_hash';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I read post-hash I was expecting something like a hash of site-url plus post ID, so we can verify that it's the same post on both ends. Since you're only doing an ID, maybe it can be called ID? Network id? Also, you can use uniqid()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that. Changed to network_post_id using an md5 of site URL and post ID in 2d3f9ca

Copy link
Contributor

@leogermani leogermani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything works as described and the implementation is great :)

post meta is also not coming, but I guess that's still expected.

The only comment I have is that this class is geting big and multi-purpose already. For example we have the registering and handling of the data event, we have the methods to distribute a post, and we also have the methods to ingest a distributed post... I believe this would look (and grow) much better if we had a few smaller classes...

Maybe I would even create a Distributed_Post class that you could instantiate and call methods like get_processed_content(), unlink(), get_config(), etc...

@miguelpeixe
Copy link
Member Author

Thanks for the review, @leogermani!

The only comment I have is that this class is geting big and multi-purpose already.

I agree, if you look at the next PRs it's about to get bigger and I'd like to split it up. Once the other PRs are merged, I'll prep a PR solely for organization purposes, with no functional changes, for easier review.

Maybe I would even create a Distributed_Post class that you could instantiate and call methods like get_processed_content(), unlink(), get_config(), etc...

Yes, some OOP might be good here. I'll see how it looks.

@miguelpeixe miguelpeixe merged commit 4f2641a into epic/content-distribution Dec 9, 2024
@miguelpeixe miguelpeixe deleted the feat/content-distribution-class branch December 9, 2024 11:52
@naxoc
Copy link
Member

naxoc commented Dec 9, 2024

I'm a bit late to the review party, but I have a few ideas:

  • The set_post_distribution() function naming: I wonder if it would be easier to understand if it was called set_post_linked() to match the set_post_unlinked() naming?
  • In set_post_distribution() – should we validate that the site_urls are valid networked sites?

@miguelpeixe
Copy link
Member Author

Thanks for the comments, @naxoc!

The set_post_distribution() function naming: I wonder if it would be easier to understand if it was called set_post_linked() to match the set_post_unlinked() naming?

I think that would be the case if the method received the same boolean param to toggle link/unlinked. With set_post_distribution() we are configuring how the post distribution works.

From Leo's suggestion, I think it'll be better once we move these methods to a Distributed_Post with a set_config() method. WDYT?

In set_post_distribution() – should we validate that the site_urls are valid networked sites?

Good call! I know the hub is aware of all the nodes, but I'm not sure it's on node to nodes. If that check requires an API fetch, I'd rather not do it. I'll check!

@naxoc
Copy link
Member

naxoc commented Dec 9, 2024

From Leo's suggestion, I think it'll be better once we move these methods to a Distributed_Post with a set_config() method. WDYT?

I like that 👍

I wrote a validation thingy in this PR: https://github.com/Automattic/newspack-network/pull/159/files (line 70-ish) – and then thought that we should maybe centralize it. I can also just do that in the PR if you are ok with that (note that the PR is work in progress).

@miguelpeixe
Copy link
Member Author

That's a great idea! I'm working on a refactor for the OOP approach in #160. I can add support for it there so we do as you suggest, a single point of true for this check.

naxoc added a commit that referenced this pull request Dec 12, 2024
* feat: content distribution class

* feat: insert linked post

* refactor: use site url instead of ID

* chore: lint

* feat: return errors on post insert

* chore: lint

* feat: support webhooks request priority

* feat: unlink functionality and unit tests

* chore: lint

* feat: introduce 'linked_post_inserted' hook and listener

* chore: lint

* fix: listener hook name

* fix: typo

* chore: better docblocks

* chore: remove redundant code

* test: persist post hash

* test: remove unnecessary assertion

* refactor: distribute_post() to use handle_post_updated() method

* refactor: post hash is now network post id

* chore: update comment

* chore: Add CLI command for distribute

* feat: content distribution class (#155)

* feat: content distribution class

* feat: insert linked post

* refactor: use site url instead of ID

* chore: lint

* feat: return errors on post insert

* chore: lint

* feat: support webhooks request priority

* feat: unlink functionality and unit tests

* chore: lint

* feat: introduce 'linked_post_inserted' hook and listener

* chore: lint

* fix: listener hook name

* fix: typo

* chore: better docblocks

* chore: remove redundant code

* test: persist post hash

* test: remove unnecessary assertion

* refactor: distribute_post() to use handle_post_updated() method

* refactor: post hash is now network post id

* chore: update comment

* feat(content-distribution): prevent older content updating linked posts (#156)

* feat(content-distribution): handle post thumbnail (#157)

* Add network util class

* Almost done except for a few todos

* refactor: OOP for distributed and linked posts (#160)

* chore: Require php 8.1

* chore: Update to use util class

Also use new incoming/outgoing classes

* fix: Include correct classname

The old classname "Linked_Post" was still used instead of
"Incoming_Post" causing sync to fail.

## How to test
Probably eyeballing is enough

* chore: Add validation of outgoing post

Also move inclusion of the content distribution CLI class to better
respect the flag we are introdocuing

* chore: phpcs

* fix(content-distribution): debug post update and remove deprecated config (#165)

* fix: Don't return payload on all posts

* fix: post handling

* chore: move try-catch

* chore: Move check for networked urls

* Move more

* Move things again

* chore: Add a mock networked node

* chore: add one more node to the mock nodes

* Add a test

---------

Co-authored-by: Miguel Peixe <miguel.peixe@automattic.com>
matticbot pushed a commit that referenced this pull request Jan 10, 2025
# [2.4.0-alpha.1](v2.3.4...v2.4.0-alpha.1) (2025-01-10)

### Bug Fixes

* **content-distribution:** post insertion hook and additional meta for incoming post event ([#173](#173)) ([48df13c](48df13c))
* load text domain on init hook ([#171](#171)) ([01fb89c](01fb89c))

### Features

* content distribution - experimental ([#168](#168)) ([dc837d8](dc837d8))
* **content-distribution:** add CLI command for distribute post ([#159](#159)) ([7a43b86](7a43b86)), closes [#155](#155) [#156](#156) [#157](#157) [#160](#160) [#165](#165)
* **content-distribution:** canonical url ([#177](#177)) ([5ca60ce](5ca60ce))
* **content-distribution:** capability and admin page ([#176](#176)) ([5285285](5285285))
* **content-distribution:** control distribution meta and prevent multiple dispatches ([#170](#170)) ([e76a2dc](e76a2dc))
* **content-distribution:** editor plugin for distribution ([#167](#167)) ([e10aef4](e10aef4))
* **content-distribution:** handle status changes ([#166](#166)) ([4af5da1](4af5da1))
* **content-distribution:** log incoming post errors ([#182](#182)) ([74c9119](74c9119))
* **content-distribution:** posts column ([#178](#178)) ([8e07640](8e07640))
* **content-distribution:** reserved taxonomies ([#174](#174)) ([a2c54d2](a2c54d2))
* **content-distribution:** sync comment and ping statuses ([#179](#179)) ([90c5425](90c5425))
* **content-distribution:** sync post meta ([#163](#163)) ([353a3d8](353a3d8))
* **event-log:** collapse data ([#180](#180)) ([956219d](956219d))
* limit purchase of a network membership ([#169](#169)) ([deb2683](deb2683))
matticbot pushed a commit that referenced this pull request Jan 20, 2025
# [2.4.0](v2.3.4...v2.4.0) (2025-01-20)

### Bug Fixes

* **content-distribution:** post insertion hook and additional meta for incoming post event ([#173](#173)) ([48df13c](48df13c))
* load text domain on init hook ([#171](#171)) ([01fb89c](01fb89c))

### Features

* content distribution - experimental ([#168](#168)) ([dc837d8](dc837d8))
* **content-distribution:** add CLI command for distribute post ([#159](#159)) ([7a43b86](7a43b86)), closes [#155](#155) [#156](#156) [#157](#157) [#160](#160) [#165](#165)
* **content-distribution:** canonical url ([#177](#177)) ([5ca60ce](5ca60ce))
* **content-distribution:** capability and admin page ([#176](#176)) ([5285285](5285285))
* **content-distribution:** control distribution meta and prevent multiple dispatches ([#170](#170)) ([e76a2dc](e76a2dc))
* **content-distribution:** editor plugin for distribution ([#167](#167)) ([e10aef4](e10aef4))
* **content-distribution:** handle status changes ([#166](#166)) ([4af5da1](4af5da1))
* **content-distribution:** log incoming post errors ([#182](#182)) ([74c9119](74c9119))
* **content-distribution:** posts column ([#178](#178)) ([8e07640](8e07640))
* **content-distribution:** reserved taxonomies ([#174](#174)) ([a2c54d2](a2c54d2))
* **content-distribution:** sync comment and ping statuses ([#179](#179)) ([90c5425](90c5425))
* **content-distribution:** sync post meta ([#163](#163)) ([353a3d8](353a3d8))
* **event-log:** collapse data ([#180](#180)) ([956219d](956219d))
* limit purchase of a network membership ([#169](#169)) ([deb2683](deb2683))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants