@digithree digithree commented Jun 9, 2025

Fixes:

As you might know, Pocket is ending its service almost exactly a month from today. See the notice here: https://support.mozilla.org/en-US/kb/future-of-pocket

On the 9th of July, the service will switch to 'export only mode'. They have said that the API will remain operational until October 8th, so this repo will stop being useful after that date.

However, I've prepared a fix for some build issues I was hitting on the current main, which had stopped working for me. I also added export functionality targeting Karakeep, an open-source, self-hosted Pocket alternative; see https://github.com/karakeep-app/karakeep (I recently contributed to it, and it has a good contributor community).

For anyone who is still using your project and wants to preserve their collection, the update in this PR should be useful. Have a look and see if you want to accept it.

In any case, thanks for this. I found it a good way to connect my Pocket articles to some coding projects I was working on locally.

digithree and others added 21 commits May 24, 2025 18:56
- Update click to >=8.2.1 (from unversioned)
- Update requests to >=2.32.3 (from unversioned)
- Keep sqlite-utils at >=2.4.4 (already up to date)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Update dependencies to latest versions
- Replace dict access with .get() method to handle missing keys gracefully
- Add comprehensive tests for both missing 'list' and 'since' key scenarios
- Prevents crashes when Pocket API returns unexpected response format

Fixes the stack trace:
KeyError: 'list' at utils.py line 119
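
As a rough illustration of the `.get()` change described in this commit, the sketch below reads the `list` and `since` keys without raising `KeyError`. The function name `extract_page` is hypothetical and is not taken from the PR.

```python
from typing import Optional, Tuple


def extract_page(response: dict) -> Tuple[dict, Optional[int]]:
    """Return (items, since) from an API response, tolerating missing keys.

    `extract_page` is a hypothetical helper used only for illustration.
    """
    # .get() returns the fallback instead of raising KeyError.
    items = response.get("list", {})
    since = response.get("since")
    return items, since
```

With this shape, an empty response such as `{}` yields an empty item mapping and `None` for `since` rather than a stack trace.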

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add check for items table existence before enabling FTS
- Add comprehensive tests for ensure_fts function edge cases:
  * When no items table exists (should not crash)
  * When items table exists (should create FTS)
  * When FTS already exists (should skip creation)

Fixes the stack trace:
sqlite3.OperationalError: no such table: items at utils.py line 68
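
The guard described in this commit could look roughly like the following. This is a minimal sketch using the standard-library `sqlite3` module rather than the project's own database layer; the FTS table name `items_fts` and the indexed columns are assumptions, and it requires an SQLite build with FTS5 enabled.

```python
import sqlite3


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Check sqlite_master for a table (virtual tables are listed there too)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def ensure_fts(conn: sqlite3.Connection) -> bool:
    """Create the FTS index if needed; return True only when it was created."""
    if not table_exists(conn, "items"):
        return False  # nothing fetched yet, so there is nothing to index
    if table_exists(conn, "items_fts"):
        return False  # FTS already configured, skip creation
    # Column names here are assumed for illustration.
    conn.execute(
        "CREATE VIRTUAL TABLE items_fts USING fts5("
        "resolved_title, excerpt, content='items')"
    )
    return True
```

The three branches correspond to the three test cases listed in the commit message: no `items` table, `items` present, and FTS already present.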

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Change requests.get() to requests.post() for /v3/get endpoint
- Change requests.get() to requests.post() for /v3/stats endpoint
- Restore full progress bar functionality with total item count
- /v3/stats endpoint is functional but undocumented

The core issue was HTTP method, not deprecated endpoints.
All API calls now use POST as required by Pocket API.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add --debug flag to enable detailed logging
- Log API requests, responses, and item processing
- Track offset progression and item counts
- Help identify why articles aren't being fetched

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add proper Content-Type headers for API requests
- Use 'data' parameter instead of passing args directly to requests.post()
- Add error detection and logging for API error responses
- Update tests to use requests.post instead of requests.get
- Add test coverage for API error handling

Fixes the issue where API returns {'error': '...'} instead of data.
The Pocket API requires proper form-encoded requests with headers.
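
To make the request shape concrete (POST, form-encoded body, explicit Content-Type), here is a small sketch that builds such a request without sending it. The PR itself uses `requests.post(..., data=...)`; this version uses the standard library's `urllib` only so that it has no dependencies, and `build_pocket_request` is a hypothetical name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_pocket_request(url: str, args: dict) -> Request:
    """Build (but do not send) a form-encoded POST request."""
    # urlencode produces key=value&key=value, the form-encoded body format.
    body = urlencode(args).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


req = build_pocket_request(
    "https://getpocket.com/v3/get",
    {"consumer_key": "k", "access_token": "t", "count": 50},
)
```

Passing the arguments as the body rather than as query-string parameters is the difference between the `data` parameter and passing the args directly.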

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Reduce default page size from 500 to 50 items
- Add automatic fallback mechanism for 413 errors:
  * Detect 'Payload Too Large' errors
  * Automatically reduce page size by half (minimum 10)
  * Retry with smaller page size
- Add comprehensive test coverage for 413 error handling
- Continue processing with reduced page size instead of crashing

Fixes the issue where large accounts cause 413 errors due to
excessive payload size when requesting complete item details.
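
The fallback could be sketched as below, assuming the fetch step raises a dedicated exception on an HTTP 413 response. `PayloadTooLarge`, `fetch_page_with_fallback`, and the `fetch_page` callable are hypothetical names used only for this illustration; the default of 50, the halving, and the minimum of 10 come from the commit message.

```python
class PayloadTooLarge(Exception):
    """Raised by the (hypothetical) fetch callable on an HTTP 413 response."""


def fetch_page_with_fallback(fetch_page, offset, page_size=50, min_page_size=10):
    """Fetch one page, halving the page size on 413 until it succeeds.

    Returns (items, page_size_used) so the caller can keep using the
    reduced size for later pages.
    """
    while True:
        try:
            return fetch_page(offset=offset, count=page_size), page_size
        except PayloadTooLarge:
            if page_size <= min_page_size:
                raise  # already at the floor, give up
            page_size = max(min_page_size, page_size // 2)
```

Returning the page size that worked lets the caller continue with the smaller size instead of hitting the same error on every subsequent page.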

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Change error detection from checking key existence to checking value
- Only treat as error if error key has non-None value
- Add test for success case with 'error': None in response
- Fixes false positive error detection when API returns success

The Pocket API returns {'error': None, 'list': {...}} for successful
responses, so we need to check the error value, not just key presence.
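
In code, the distinction between key presence and key value is small. A minimal sketch, with `has_api_error` as a hypothetical helper name:

```python
def has_api_error(response: dict) -> bool:
    """Treat the response as an error only if 'error' has a non-None value."""
    # `"error" in response` would be True for {'error': None, 'list': {...}},
    # which is the false positive this commit describes.
    return response.get("error") is not None
```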

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Handle numeric author_ids normally (existing schema)
- For string author_ids (alternative schema):
  * Treat the string as the author name
  * Generate deterministic integer ID using MD5 hash
  * Maintain integer author_id constraint in database
- Add comprehensive test coverage for:
  * String author IDs become names with generated IDs
  * Mixed numeric/string author IDs in same item
  * Consistent ID generation for same string values

Supports ~5-10% of Pocket items that use alternative author schema
without breaking existing database structure or functionality.
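
One way to derive a deterministic integer ID from a string is sketched below. The commit only says an MD5 hash is used; taking the first 8 hex digits of the digest is an assumption made here for illustration, as are the name `normalize_author` and the shape of the `author` dict.

```python
import hashlib


def normalize_author(raw_id, author: dict):
    """Return (author_id, name) with an integer ID for either schema."""
    text = str(raw_id)
    if text.isdigit():
        # Existing schema: numeric ID, name taken from the author record.
        return int(text), author.get("name")
    # Alternative schema: the "ID" is really the author's name.
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    # Assumed truncation: the first 8 hex digits give a stable 32-bit integer.
    return int(digest[:8], 16), text
```

Because MD5 is deterministic, the same author string maps to the same ID across runs, which keeps the integer `author_id` column usable as a key. Truncating the digest does make collisions between different names possible, if unlikely.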

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
…tamps

- Replace timestamp-based 'since' parameter with offset-based approach
- Track number of existing items in database for incremental fetching
- Start fetching from offset = count of existing items
- Remove problematic since/timestamp logic that wasn't working
- Add test coverage for start_offset functionality

This properly resumes fetching from where it left off by continuing
pagination from the number of items already stored in the database.
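
The resume logic amounts to counting stored rows and paginating from there. A minimal sketch using the standard-library `sqlite3` module; `starting_offset` and `page_offsets` are hypothetical names, not the PR's functions.

```python
import sqlite3


def starting_offset(conn: sqlite3.Connection) -> int:
    """Resume from the number of items already stored (0 for a fresh database)."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'items'"
    ).fetchone()
    if not exists:
        return 0
    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]


def page_offsets(start_offset: int, total: int, page_size: int) -> list:
    """Offsets of the pages still to be requested."""
    return list(range(start_offset, total, page_size))
```

This approach only resumes correctly if the API returns items in a stable order between runs, which the commit message does not state.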

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Comprehensive fixes: API connectivity, error handling, retry logic, and progress tracking
Implement complete export system to sync Pocket items to Karakeep:

**New Export Command:**
- `pocket-to-sqlite export` with full CLI interface
- Supports filtering by status (unread/archived/deleted) and favorites
- Includes limit/offset for batching and resume capability
- Dry-run mode for preview without API calls
- Progress bar with accurate tracking
- Comprehensive error handling and logging

**KarakeepClient API Integration:**
- REST API client with Bearer token authentication
- Configurable base URL (defaults to localhost:3000)
- Progressive retry logic for timeouts, rate limits (429), and server errors
- 30-second request timeout to prevent hanging
- Automatic rate limiting with 1-second delays between requests

**Authentication Extension:**
- Extends existing auth.json to include karakeep_token and karakeep_base_url
- Maintains compatibility with existing Pocket authentication
- Validates required Karakeep credentials before export
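
A sketch of the credential check, assuming the two keys named above live alongside the existing Pocket entries in `auth.json`. The function name `load_karakeep_settings` and the `http://` scheme on the default URL are assumptions; the commit only states that the default is localhost:3000.

```python
import json

# Assumed scheme; the commit message only says "localhost:3000".
DEFAULT_KARAKEEP_BASE_URL = "http://localhost:3000"


def load_karakeep_settings(auth_text: str):
    """Return (token, base_url) from the contents of auth.json."""
    auth = json.loads(auth_text)
    token = auth.get("karakeep_token")
    if not token:
        raise ValueError("auth.json is missing karakeep_token")
    base_url = auth.get("karakeep_base_url") or DEFAULT_KARAKEEP_BASE_URL
    # Strip a trailing slash so paths can be appended consistently.
    return token, base_url.rstrip("/")
```

Existing Pocket keys in the file are simply ignored by this helper, which is one way to keep the extension backward compatible.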

**Data Mapping & Validation:**
- Maps Pocket fields to Karakeep bookmark format:
  - resolved_title/given_title → title
  - excerpt → summary
  - resolved_url/given_url → url
  - type → "link" (constant)
- Skips items without URLs with proper logging
- Handles missing/null fields gracefully
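
The mapping above can be sketched as a small function. The field names come from the list in this commit message; the name `pocket_item_to_bookmark` and the choice to omit empty optional fields are assumptions.

```python
from typing import Optional


def pocket_item_to_bookmark(item: dict) -> Optional[dict]:
    """Map a Pocket item row to a Karakeep-style bookmark dict.

    Returns None when the item has no usable URL, so the caller can
    log and skip it.
    """
    url = item.get("resolved_url") or item.get("given_url")
    if not url:
        return None
    bookmark = {"type": "link", "url": url}
    title = item.get("resolved_title") or item.get("given_title")
    if title:
        bookmark["title"] = title
    if item.get("excerpt"):
        bookmark["summary"] = item["excerpt"]
    return bookmark
```

Using `or` for the fallbacks treats both missing keys and null or empty values the same way.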

**Robust Error Handling:**
- Retry logic for network timeouts and server errors
- Graceful handling of malformed data
- Comprehensive error reporting with item-level details
- Resume capability via offset parameter

**Comprehensive Test Coverage:**
- 7 new test cases covering all functionality
- Tests for KarakeepClient retry logic and error handling
- Export function tests with filters and edge cases
- Mock-based testing for isolated unit tests
- Tests for dry-run preview functionality

**CLI Features:**
- `--filter-status [0|1|2]` - Filter by Pocket status
- `--filter-favorite` - Export only favorited items
- `--limit` / `--offset` - Batching and resume support
- `--dry-run` - Preview mode without API calls
- `--silent` - Suppress progress output
- `--debug` - Enable detailed logging
- `--auth FILE` - Custom auth file path

**Example Usage:**
```bash
# Export all items with progress bar
pocket-to-sqlite export pocket.db

# Export only favorites with custom auth
pocket-to-sqlite export pocket.db --filter-favorite --auth myauth.json

# Preview first 10 unread items
pocket-to-sqlite export pocket.db --filter-status 0 --limit 10 --dry-run

# Resume export from offset 1000
pocket-to-sqlite export pocket.db --offset 1000
```

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Improve KarakeepClient to properly handle real Karakeep API responses:

**API Response Handling:**
- Handle 201 success responses with full bookmark object structure
- Parse 400 error responses with {code, message} format
- Add proper error handling for client errors (400-level) without retry
- Log successful bookmark creation with bookmark ID

**Enhanced Error Handling:**
- Client errors (400-499) now show proper error codes and messages
- No retry attempts for validation errors and other client errors
- Maintain retry logic only for server errors and rate limits
- Improved error messages with Karakeep error codes
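
The retry policy described here separates rate limits and server errors from other client errors. A minimal sketch of that classification follows; `should_retry` and `retry_delay` are hypothetical names, and the linearly growing delay is an assumption, since the commit messages say "progressive" without giving numbers.

```python
def should_retry(status_code: int) -> bool:
    """Retry rate limits and server errors, but not other client errors."""
    if status_code == 429:
        return True  # rate limited: worth waiting and trying again
    if 400 <= status_code < 500:
        return False  # validation and other client errors will not improve
    return status_code >= 500


def retry_delay(attempt: int, base_seconds: float = 1.0) -> float:
    """Assumed progressive backoff: 1s, 2s, 3s, ... for attempts 0, 1, 2, ..."""
    return base_seconds * (attempt + 1)
```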

**Updated Test Coverage:**
- Tests now use actual Karakeep API response structure
- Added test for 400 error response format handling
- Mock responses match real API format with proper fields
- Verified both success and error path handling

**Verified Working Integration:**
- Successfully tested against real Karakeep API
- Confirmed 201 responses with bookmark ID generation
- Debug logging shows proper API interaction flow
- All authentication and request formatting working correctly

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add detailed section covering Karakeep export functionality:

**New Documentation Sections:**
- Authentication setup for Karakeep integration
- Basic export command usage and examples
- Comprehensive filtering options (status, favorites)
- Batching and resuming capabilities for large exports
- Dry-run preview functionality
- Additional command options (auth file, silent mode, debug)
- Notes about built-in retry logic and error handling

**Clear Examples for:**
- Status filtering (unread/archived/deleted items)
- Favorite filtering
- Batched exports with limit/offset
- Preview mode with dry-run
- Custom authentication file usage
- Debug and silent operation modes

The documentation follows existing README style and provides users
with complete information for using the new export functionality.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add Karakeep export command with comprehensive functionality
- Enhanced KarakeepClient with get_all_tags() and add_tags_to_bookmark() methods
- Added tag caching and automatic ID generation for new tags
- Updated export logic to parse existing tags column and attach to Karakeep bookmarks
- Added graceful handling for databases with/without tags column
- Maintains backward compatibility with existing functionality
- All tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fixed tag parsing to properly handle Pocket's nested JSON format: {"tag_name": {"tag": "tag_name", "item_id": "123"}}
- Extract tag names from the nested 'tag' field within each tag object
- Maintain fallback to key names if nested structure not found
- Updated comments with appropriate examples
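
Parsing the nested format shown above, with the fallback to key names, might look like this. The function name `parse_tag_names` is hypothetical; the JSON shape is the one given in this commit message.

```python
import json


def parse_tag_names(tags_json) -> list:
    """Extract tag names from Pocket's nested tags JSON."""
    if not tags_json:
        return []
    try:
        tags = json.loads(tags_json)
    except (TypeError, ValueError):
        return []  # malformed or non-string data: treat as no tags
    if not isinstance(tags, dict):
        return []
    names = []
    for key, value in tags.items():
        if isinstance(value, dict) and value.get("tag"):
            names.append(value["tag"])  # nested 'tag' field
        else:
            names.append(key)  # fall back to the key name
    return names
```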

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add logging to show tags_data from database for each item
- Add logging to show parsed tag names and count
- Add specific logging for items with no tags vs no bookmark_id
- Help troubleshoot tag processing during export

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add tagging support to Karakeep export
Copilot AI review requested due to automatic review settings June 9, 2025 20:11

Copilot AI left a comment

Pull Request Overview

This PR updates dependency constraints, refactors the incremental fetch logic to use offsets, and introduces a new export command for exporting items to Karakeep.

  • Bump click and requests versions to restore build compatibility
  • Replace “since”-based fetching with offset-based logic and add a --debug flag
  • Add export command (with filtering, dry-run, and progress support) and corresponding README docs

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Tighten click and requests version requirements to latest compatible releases |
| pocket_to_sqlite/cli.py | Refactor fetch to use offset-based fetching; add --debug to both fetch and new export command |
| README.md | Document new export command for Karakeep integration and usage examples |

Comments suppressed due to low confidence (3)

pocket_to_sqlite/cli.py:85

  • [nitpick] The parameter name all shadows the built-in all() function. Consider renaming it to fetch_all for clarity.
def fetch(db_path, auth, all, silent, debug):

pocket_to_sqlite/cli.py:146

  • The README mentions karakeep_base_url but the code does not validate or use it. Add a check for karakeep_base_url and ensure it’s passed to the export logic.
if "karakeep_token" not in auth_data:

pocket_to_sqlite/cli.py:114

  • New export functionality lacks accompanying tests. Consider adding unit or integration tests to cover filtering, dry-run, and actual export flows.
@cli.command()

Fix improper usage of count() method

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>