Greader API : get modified items #2566

Shinokuni · 2019-10-12T11:15:18Z

Hello,

I am writing an Android RSS client with FreshRSS support and I encountered a problem when syncing items using the Greader API.

I would like to get from the server the items whose read state has been modified. Let's say I synchronize my items list with my Android client. Then, I mark some items as read in the web client. I would like the mobile client to be notified that these items have been marked as read.

I already know that if an item contains user/-/state/com.google/read in the categories field of the returned JSON, like this:

"categories": [
    "user/-/state/com.google/reading-list",  
    "user/-/label/IT",
    "user/-/state/com.google/read"
]

it is read and I can mark it as read in the mobile client database.
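A minimal sketch of that check, assuming the item has already been parsed from the JSON response into a dictionary (the item ID and the function name are hypothetical, for illustration only):

```python
READ_TAG = "user/-/state/com.google/read"


def is_read(item):
    """Return True when the item's categories include the read state tag."""
    return READ_TAG in item.get("categories", [])


item = {
    "id": "example-item-id",  # hypothetical ID, for illustration only
    "categories": [
        "user/-/state/com.google/reading-list",
        "user/-/label/IT",
        "user/-/state/com.google/read",
    ],
}
print(is_read(item))
```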

When fetching items with the reader/api/0/stream/contents/user/-/state/com.google/reading-list endpoint and the ot parameter (Unix timestamp), I can get new items and, if they were read before the mobile client fetches them, they will have user/-/state/com.google/read in their categories field. But if I mark an item older than the ot parameter as read, I won't get it.

Am I missing something that could solve this problem? If not, is there anything that could be done about it?

Otherwise, thanks for this awesome project that is FreshRSS!

Alkarex · 2019-10-12T11:53:22Z

Hello @Shinokuni and welcome :-)

First, I would like to say that getting the synchronisation strategy right is essential for a good client. Except for News+ and, to a lesser extent, EasyRSS, the other clients I have tested all have inefficient synchronisation strategies (in some cases very bad). By inefficient, I mean far too many requests, redundant requests, as well as expensive requests for the client and/or the server (leading to slow synchronisation, high battery consumption, high bandwidth consumption, high CPU usage on client and server, high database usage on server, etc.).

Therefore, I am always very pleased to provide the exact API calls to perform.
The following seven requests are what News+ does for its global synchronisation (see also full log below), which is both robust and efficient. No need to make a single additional request for that phase. I can also provide logs for other phases such as login, posting changes, etc. In case of doubt, I suggest you install News+ and check on your server the exact calls that are made, and do the same.

1. /reader/api/0/tag/list
   - Full list of categories/folders and tags/labels and, for Inoreader compatibility, the number of unread items in each tag/label
2. /reader/api/0/subscription/list
   - Full list of subscriptions/feeds, including their category/folder.
   - This is where you get the distinction between categories/folders and tags/labels
3. /reader/api/0/stream/contents/user/-/state/com.google/reading-list (with some filters in parameter to exclude read items with xt, and get only the new ones with ot, cf. log below)
   - List of new unread items and their content
   - The response contains among other things the read/unread state, the starred/not-starred state, and the tags/labels for each entry.
   - Since this request is very expensive for the client, the network, and the server, it is important to use the filters appropriately.
   - If there is no new item since the last synchronisation, the response should be empty, and therefore efficient
4. /reader/api/0/stream/items/ids (with a filter in parameter to exclude read items with xt)
   - Longer list of unread item IDs
   - This allows updating the read/unread status of the local cache of articles, assuming the ones not in the list are read
5. /reader/api/0/stream/contents/user/-/state/com.google/starred (with some filters in parameter to exclude read items with xt, and get only the new ones with ot)
   - List of new unread starred items and their content
   - If there is no new unread starred item since the last synchronisation, the response should be empty, and therefore efficient
   - This is a bit redundant with requests 3 and 6, but with the advantage of being able to retrieve a larger number of unread starred items.
6. /reader/api/0/stream/contents/user/-/state/com.google/starred (with some other filters, which include read starred items)
   - List of starred items (also read ones) and their content
7. /reader/api/0/stream/items/ids (with a filter to get only starred ones)
   - Longer list of starred item IDs
   - This allows updating the starred/non-starred status of the local cache of articles, assuming the ones not in the list are not starred
   - Similar to request 4 but for the starred status
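As a rough illustration, the seven requests above could be assembled as follows. This is only a sketch under assumptions: the server address is hypothetical, the xt and ot filters come from the description above, and the n (maximum count), s (stream) and output parameters as well as both limits are assumptions based on common Google Reader API conventions that should be checked against the server logs.

```python
from urllib.parse import urlencode

# Hypothetical server address; replace it with the address of your own instance.
BASE = "https://freshrss.example.net/api/greader.php"

READ = "user/-/state/com.google/read"
READING_LIST = "user/-/state/com.google/reading-list"
STARRED = "user/-/state/com.google/starred"


def build_url(path, **params):
    """Join the base address, an API path, and an optional query string."""
    query = urlencode(params, safe="/")  # keep "/" readable in stream IDs
    return BASE + path + ("?" + query if query else "")


def global_sync_urls(last_sync, content_limit=1000, id_limit=10000):
    """Return the seven URLs, in the same order as the list above.

    The n, s and output parameters and both limits are assumptions.
    """
    contents = "/reader/api/0/stream/contents/"
    ids = "/reader/api/0/stream/items/ids"
    return [
        build_url("/reader/api/0/tag/list", output="json"),
        build_url("/reader/api/0/subscription/list", output="json"),
        # New unread items only: exclude read ones (xt), newer than last sync (ot)
        build_url(contents + READING_LIST, output="json", n=content_limit, xt=READ, ot=last_sync),
        # IDs of unread items, used to refresh the local read/unread state
        build_url(ids, output="json", s=READING_LIST, n=id_limit, xt=READ),
        # New unread starred items
        build_url(contents + STARRED, output="json", n=content_limit, xt=READ, ot=last_sync),
        # Starred items, read ones included
        build_url(contents + STARRED, output="json", n=content_limit),
        # IDs of starred items, used to refresh the local starred state
        build_url(ids, output="json", s=STARRED, n=id_limit),
    ]


for number, url in enumerate(global_sync_urls(last_sync=1570880000), start=1):
    print(number, url)
```

Only the two stream/contents calls for new items carry the ot filter, so they should return an empty response when nothing new has arrived since the last synchronisation.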

It is also possible in News+ to synchronise / "pull for refresh" a specific category/folder, or feed, or tag/label, but that is only necessary when the user wants to get read items or more/older items than the global limit.

Full log:

Do not hesitate to ask again, but please consider this synchronisation strategy.

Frenzie · 2019-10-12T13:06:47Z

@Alkarex Sounds like a good thing to stick in the docs which should make it easier to find through a search engine, maybe somewhere in https://freshrss.github.io/FreshRSS/en/developers/01_First_steps.html?

Shinokuni · 2019-10-12T14:35:56Z

First of all, thank you for your answer.

Here is the way Readrops handles synchronization.

Initial sync

One of the main functionalities of Readrops is to provide an offline experience. Therefore, a large quantity of items is fetched and stored locally when doing the initial synchronization.

Steps:

1. Fetch feeds: /reader/api/0/subscription/list
2. Fetch folders: /reader/api/0/tag/list
3. Fetch only unread items, up to a maximum of 10k: /reader/api/0/stream/contents/user/-/state/com.google/reading-list

Classic sync

Steps:

1. Push read items: /reader/api/0/edit-tag
2. Push unread items: /reader/api/0/edit-tag
3. Fetch feeds, to get new and updated feeds and know which feeds were deleted: /reader/api/0/subscription/list
4. Fetch folders, the same as for feeds: /reader/api/0/tag/list
5. Fetch new unread items since the last synchronization: /reader/api/0/stream/contents/user/-/state/com.google/reading-list

The way Readrops handles synchronization is more or less the same as what you described, except that Readrops doesn't fetch starred items and makes one query per item read state.

The initial point of my issue was to know whether there is a way to get the items modified since a precise time. This would make it possible to keep a coherent read/unread state for all items across all platforms.

Do I have to conclude that there is no way to do this?

Frenzie · 2019-10-12T14:43:15Z

One of the main functionalities of Readrops is to provide an offline experience.

Speaking just for myself of course, but I doubt I'd even consider using a third-party client except for the offline experience. ;-)

(That's why I currently use EasyRSS.)

Shinokuni · 2019-10-12T14:54:30Z

Speaking just for myself of course, but I doubt I'd even consider using a third-party client except for the offline experience. ;-)

(That's why I currently use EasyRSS.)

Yeah, I also believe it is important to have offline access to one's feeds. Personally, not having offline access wouldn't bother me that much because the situations where I don't have any connection (that damned RER A) are infrequent and I can do something else.

Alkarex · 2019-10-12T17:47:51Z

makes one query per item read state

@Shinokuni Could you please explain that again?

Please check requests 4 and 7.

Shinokuni · 2019-10-12T19:19:51Z

makes one query per item read state

@Shinokuni Could you please explain that again?

Yeah, sorry. I meant one request to mark items as read and one request to mark items as unread with /reader/api/0/edit-tag.

Please check requests 4 and 7.

This is interesting. If I use the ot parameter (newer than), do you know whether I will get the IDs of the latest modified items or only those of the latest items? Using ot would avoid fetching an arbitrary number of IDs to get all modified item IDs; I would fetch just the new and modified item IDs.

Alkarex · 2019-10-12T20:17:36Z

No, it is not the date when the items were modified, but the date when they were discovered / added to the database. They are still the best calls to get the states as they only retrieve IDs.

Shinokuni · 2019-10-12T20:24:32Z

Does this mean that if I change the read state of an item created three months ago, I will have to fetch three months of item IDs to get it? In that case, it won't be useful because it is too expensive.

Alkarex · 2019-10-12T22:51:19Z

I agree that the API could surely be improved. We could make some additions (I am open to that), but changing the behaviour of existing calls risks breaking other clients obeying the Google Reader API.
In any case, there are many more items on the server than on the client, so the client needs to make reasonable calls.

When you want the states and ask for the IDs, you ask only for the unread ones (the IDs not in the list are read). The length of that list is at most the number of unread articles on the server, and can be limited by number and date, so it is not that bad.
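A minimal sketch of that reconciliation, assuming the client keeps a mapping from item ID to its locally stored read flag (the function name and the IDs are hypothetical). Any local item missing from the server's unread list is treated as read. If the list is limited by number or date, only local items inside that same window should be passed in.

```python
def reconcile_read_state(local_states, server_unread_ids):
    """Return {item_id: new_read_flag} for the local items whose state differs.

    local_states maps an item ID to True (read) or False (unread).
    server_unread_ids is the set of IDs returned by the unread IDs request.
    """
    changes = {}
    for item_id, locally_read in local_states.items():
        read_on_server = item_id not in server_unread_ids
        if read_on_server != locally_read:
            changes[item_id] = read_on_server
    return changes


local_states = {"a": False, "b": False, "c": True}  # hypothetical local cache
server_unread_ids = {"a", "c", "d"}  # "d" is not cached locally yet
changes = reconcile_read_state(local_states, server_unread_ids)
print(changes)
```

Here "b" was marked as read elsewhere and "c" was marked as unread again, so both are reported, while "d" is ignored because its content has not been fetched yet.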

In practice, I generally have between 1k and 4k unread articles, 300k+ read articles, ~160 feeds, ~17 categories, 400+ favourites, ~10 tags. A full sync in News+ takes about 3s.

Shinokuni · 2019-10-13T12:06:53Z

When you want the states and ask for the IDs, you ask only for the unread ones (the IDs not in the list are read). The length of that list is at most the number of unread articles on the server, and can be limited by number and date, so it is not that bad.

You are right, it is not that bad. But I will still look into a limit to avoid fetching all unread item IDs. My personal account has 4k unread items, so that is 4k local items to update if I fetch all of them. I don't mind when doing the initial sync, but for a classic one, it is not insignificant.

I agree that the API could surely be improved. We could make some additions (I am open to that)

That's nice!
I see two ways to handle read state synchronization here:

- Return new and modified items with reader/api/0/stream/contents/user/-/state/com.google/reading-list. The items list would be sorted by last modified date, the insertion date being the first last modified time. This change could break existing API implementations by returning items that already exist locally. If the client doesn't have any kind of upsert strategy, this will create duplicates.
- Return new and modified item IDs with /reader/api/0/stream/items/ids, applying the same strategy as in the first point. This wouldn't break anything because only the order would be modified.

Anyway, a big thanks for taking the time to answer me. I will investigate the /reader/api/0/stream/items/ids solution.

Alkarex · 2020-02-29T13:10:39Z

@Shinokuni I have tested your client today, and it looks very good already, congrats :-)
#2798
Closing here, but do not hesitate to ask again, especially if you need any documentation / feedback

Shinokuni · 2020-12-28T11:50:05Z

Hello, as promised in readrops/Readrops#53 (comment), here is a post about FreshRSS synchronization in Readrops. Due to a lack of time, I wasn't able to make it sooner.

Issues

I recently (more or less) worked on adding FreshRSS starred items to Readrops, and it made me work on item read state synchronization again. While I didn't really encounter problems managing requests, the database side was more difficult.

First, SQLite limits the number of arguments you can give to an IN operator to 999. It means that when you get more than a thousand item IDs with /reader/api/0/stream/items/ids, you have to split them and run multiple queries to update the state of each item, which would be really slow. This also affects the star state synchronization.
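To make the constraint concrete, here is a small sketch of the splitting described above, using Python's built-in sqlite3 module. The table and column names are hypothetical and do not reflect the actual Readrops schema.

```python
import sqlite3

MAX_PARAMS = 999  # the limit mentioned above


def mark_unread_in_chunks(conn, unread_ids, chunk_size=MAX_PARAMS):
    """Mark the given items as unread, one UPDATE per chunk of IDs.

    Returns the number of UPDATE statements that were needed.
    """
    statements = 0
    for start in range(0, len(unread_ids), chunk_size):
        chunk = unread_ids[start:start + chunk_size]
        placeholders = ",".join("?" for _ in chunk)
        conn.execute(
            f"UPDATE item SET read = 0 WHERE remote_id IN ({placeholders})",
            chunk,
        )
        statements += 1
    conn.commit()
    return statements


# Hypothetical local cache: 2500 items, all currently marked as read
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (remote_id TEXT PRIMARY KEY, read INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO item (remote_id, read) VALUES (?, 1)",
    [(f"item-{n}",) for n in range(2500)],
)

server_unread_ids = [f"item-{n}" for n in range(2300)]
statements = mark_unread_in_chunks(conn, server_unread_ids)
unread_count = conn.execute("SELECT COUNT(*) FROM item WHERE read = 0").fetchone()[0]
print(statements, unread_count)
```

With 2300 IDs and chunks of 999, three separate UPDATE statements are needed, and the count grows with the size of the ID list.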

D/FreshRSSRepository: FreshRSS sync timer:      704 ms, server queries
D/FreshRSSRepository: FreshRSS sync timer:      9 ms, folders insertion
D/FreshRSSRepository: FreshRSS sync timer:      84 ms, feeds insertion
D/FreshRSSRepository: FreshRSS sync timer:      0 ms, items insertion
D/FreshRSSRepository: FreshRSS sync timer:      495 ms, starred items insertion
D/FreshRSSRepository: FreshRSS sync timer:      528 ms, update starred items state
D/FreshRSSRepository: FreshRSS sync timer:      1071 ms, reset read changes
D/FreshRSSRepository: FreshRSS sync timer:      2322 ms, reset star changes
D/FreshRSSRepository: FreshRSS sync timer: end, 5213 ms

Here is a log of the synchronization after I implemented the fetch of the starred items. The starred items insertion checks for each item whether it already exists in the db and inserts it if not, which is pretty slow given that the test data was only about 10 starred items. Then the items' star state is updated with the IDs from /reader/api/0/stream/items/ids, which is also very slow. Finally, the local read/star state, which indicates whether an item had one of these states modified, is reset.

The synchronization doesn't include the update of the read state and doesn't fetch any new items, yet it lasts 5 seconds, which is way too much. I had to improve all of this.

Solution

Requests strategy

Here is the new request strategy:

1. Folders: reader/api/0/tag/list
2. Feeds: reader/api/0/subscription/list
3. New items: reader/api/0/stream/contents/user/-/state/com.google/reading-list
   - exclude starred items
   - only the new ones
4. Unread item IDs: reader/api/0/stream/items/ids
   - only unread and not starred item IDs
   - only the 5000 newest
5. Starred items: reader/api/0/stream/contents/user/-/state/com.google/starred
   - all starred items (read/unread)
   - only the 1000 newest

I don't make any further calls as it's not needed with the database strategy below.

Database strategy

I added a few new tables:

- A table to store unread item IDs from /reader/api/0/stream/items/ids
- A table to store starred items from reader/api/0/stream/contents/user/-/state/com.google/starred

Instead of directly updating each item's read state with the IDs (limited to 999) in the query, all unread item IDs are stored in a new table and then used to update the read state. Before inserting the unread item IDs into the new table, all IDs from the previous synchronization are deleted. This process makes the update faster, even if it's not perfect.

Instead of dealing with starred item IDs, only starred items are fetched and stored in a separate table. This avoids any extra request or query to update the read and star state of starred items. Before the insertion, all items inserted during the previous synchronization are deleted. The fetch of starred items is limited to 1000 for performance.
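The unread IDs part of this strategy could look roughly like the sketch below, written with Python's built-in sqlite3 module. The table and column names are hypothetical and are not the actual Readrops schema. The sketch also updates every local item, whereas a real client that only fetches the newest IDs would probably restrict the update to the same window.

```python
import sqlite3


def sync_read_state(conn, server_unread_ids):
    """Replace the stored unread IDs, then update all read flags in one query."""
    with conn:  # one transaction: commit on success, roll back on error
        # Drop the IDs kept from the previous synchronization
        conn.execute("DELETE FROM unread_item_id")
        conn.executemany(
            "INSERT OR IGNORE INTO unread_item_id (remote_id) VALUES (?)",
            ((item_id,) for item_id in server_unread_ids),
        )
        # No bound parameters here, so the size of the ID list does not matter
        conn.execute(
            """
            UPDATE item
            SET read = CASE
                WHEN remote_id IN (SELECT remote_id FROM unread_item_id) THEN 0
                ELSE 1
            END
            """
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (remote_id TEXT PRIMARY KEY, read INTEGER NOT NULL)")
conn.execute("CREATE TABLE unread_item_id (remote_id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO item (remote_id, read) VALUES (?, ?)",
    [("a", 0), ("b", 0), ("c", 1)],  # hypothetical local cache
)

sync_read_state(conn, ["a", "c"])
states = dict(conn.execute("SELECT remote_id, read FROM item"))
print(states)
```

Because the IDs are inserted as rows and the UPDATE uses a subquery, the state update is a single statement regardless of how many IDs the server returned.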

Result

D/FreshRSSRepository: FreshRSS sync timer:      530 ms, server queries
D/FreshRSSRepository: FreshRSS sync timer:      10 ms, folders insertion
D/FreshRSSRepository: FreshRSS sync timer:      72 ms, feeds insertion
D/FreshRSSRepository: FreshRSS sync timer:      0 ms, items insertion
D/FreshRSSRepository: FreshRSS sync timer:      7 ms, starred items insertion
D/FreshRSSRepository: FreshRSS sync timer:      760 ms, insert and update items ids
D/FreshRSSRepository: FreshRSS sync timer: end, 1384 ms

The result is a lot better. Some steps were removed and others improved. Of course, this result was obtained under good conditions: fast Wi-Fi connection, fast phone (OP 6), no new items and very few starred items. A synchronization under less favorable conditions would take around 3 seconds.

Things left to do

I didn't write anything about pushing read/star state changes from Readrops. I have two solutions here:

1. Create a new table which will store all state changes. The changes will be pushed when synchronizing.
2. Push the update just after a change. For example, when a user clicks on an item, a request is made to push the read state change. This has several drawbacks: it produces a lot of tiny requests which could be unified into a single one, and if internet isn't available, the change would need to be stored somewhere so that it can be pushed when internet is back; otherwise the change would never reach the server.
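The first solution could be sketched as follows: pending read state changes are kept in a table, then grouped into one /reader/api/0/edit-tag request body for items to mark as read and one for items to mark as unread. The table name and helper functions are hypothetical, and the a (add tag), r (remove tag) and i (item ID) parameter names are assumptions based on Google Reader API conventions that should be verified against the server. Nothing is sent here, and any session token the server may require is left out.

```python
import sqlite3
from urllib.parse import urlencode

READ = "user/-/state/com.google/read"


def record_change(conn, remote_id, read):
    """Remember the latest read state chosen locally for an item."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO pending_read_change (remote_id, read) VALUES (?, ?)",
            (remote_id, int(read)),
        )


def build_edit_tag_bodies(conn):
    """Build at most two form bodies: one adding the read tag, one removing it."""
    bodies = []
    for read, action in ((1, "a"), (0, "r")):  # a = add tag, r = remove tag (assumed)
        rows = conn.execute(
            "SELECT remote_id FROM pending_read_change WHERE read = ? ORDER BY remote_id",
            (read,),
        )
        ids = [row[0] for row in rows]
        if ids:
            fields = [(action, READ)] + [("i", item_id) for item_id in ids]
            bodies.append(urlencode(fields, safe="/"))
    return bodies


def clear_changes(conn):
    """Forget the pending changes once the server has accepted them."""
    with conn:
        conn.execute("DELETE FROM pending_read_change")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pending_read_change (remote_id TEXT PRIMARY KEY, read INTEGER NOT NULL)"
)
record_change(conn, "a", True)
record_change(conn, "b", True)
record_change(conn, "c", False)
record_change(conn, "a", False)  # the user changed their mind: only the last state is kept

bodies = build_edit_tag_bodies(conn)
for body in bodies:
    print(body)
```

Keeping only the latest state per item means that changes made while offline collapse into a small number of requests at the next synchronization.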

Feel free to suggest changes, I'm totally open to modifications.

Alkarex · 2020-12-28T13:00:50Z

@Shinokuni Thanks for the update; that looks very good, congrats 👍
Regarding pushing state changes, I suggest a hybrid approach: you need to maintain a list of state changes anyway (e.g. to change the state of multiple articles at once, or in case a request fails, the phone is offline, etc.), and you can push regularly (when synchronising, but also at significant events, for instance when changing view, or before closing the app, if you can catch that).
Keep up the good work!

Shinokuni · 2020-12-28T14:29:07Z

Thanks for the suggestion, I will think about it!

Alkarex added API 🤝 API for other clients Documentation 📚 labels Oct 12, 2019

Alkarex mentioned this issue Dec 30, 2019

Reeder not working with FreshRSS greader API anymore #2684

Closed

Alkarex added this to the 1.16.0 milestone Feb 29, 2020

Alkarex closed this as completed Feb 29, 2020

barijaona mentioned this issue Mar 22, 2020

[GReader API] feed ids provided by 'unread-count' do not follow the standard #2834

Closed

Alkarex mentioned this issue Mar 22, 2020

[GReader API] Return the unread articles count for the unlisted feed. #2839

Closed

Alkarex mentioned this issue Apr 24, 2020

FreshRSS: Read articles switch back to unread Ranchero-Software/NetNewsWire#1519

Closed

This was referenced May 2, 2020

[fever-api] mixed string and number in time #2940

Closed

Table overview of compatible clients (Help welcome) #2942

Merged

Alkarex mentioned this issue Jul 14, 2020

[greader-api] feed ids #3107

Closed

Alkarex mentioned this issue Aug 18, 2020

希望支持 inoreader yang991178/fluent-reader#10

Closed

vincode-io mentioned this issue Oct 20, 2020

FreshRSS: Rework refreshAll to be more efficient Ranchero-Software/NetNewsWire#2508

Closed

Alkarex mentioned this issue Nov 2, 2020

Sync read later / starred articles readrops/Readrops#53

Closed

Frenzie mentioned this issue Jan 29, 2021

Greader API: "edit-tag" method does not seem to work. #3400

Open

Frenzie mentioned this issue Jul 29, 2021

[BUG] greader.php : Allowed memory size exhausted (HTTP error 500) #3736

Closed

Massedil mentioned this issue Jul 29, 2021

Readrops asks for too much articles, overloads the server and no article is retrieved readrops/Readrops#116

Open

This was referenced Sep 10, 2022

Update README.md #3859

Merged

Update README.fr.md #3860

Merged

Alkarex mentioned this issue Nov 19, 2023

RSS Reader Api Support in newsfeed module MagicMirrorOrg/MagicMirror#3263

Open

Ashinch mentioned this issue Jan 30, 2024

refactor(greader): incrementally fetch the unread items by difference set Ashinch/ReadYou#570

Merged

Alkarex mentioned this issue Oct 27, 2024

is this repo still being worked on or is dead??? yang991178/fluent-reader#657

Closed
