4 changes: 4 additions & 0 deletions template-react-ts/AGENTS.md
@@ -16,3 +16,7 @@ This repository contains "skills" that guide AI agents in developing Harper applications
- [Handling Binary Data](skills/handling-binary-data.md): How to store and serve binary data like images or MP3s.
- [Serving Web Content](skills/serving-web-content): Two ways to serve web content from a Harper application.
- [Checking Authentication](skills/checking-authentication.md): How to use sessions to verify user identity and roles.
- [Caching](skills/caching.md): How caching is defined and implemented in Harper applications.
- [JWT Authentication](skills/jwt-authentication.md): How Harper issues and handles JWT authentication and RBAC for secure operations.
- [Using Blobs](skills/using-blob-datatype.md): How to store and retrieve large binary data in Harper.
- [Vector Indexing](skills/vector-indexing.md): How to define and use vector indexes for efficient similarity search.
112 changes: 112 additions & 0 deletions template-react-ts/skills/caching.md
@@ -0,0 +1,112 @@
# Harper Caching

Harper includes integrated support for **caching data from external sources**, enabling high-performance, low-latency cache storage that is fully queryable and interoperable with your applications. With built-in caching capabilities and distributed responsiveness, Harper makes an ideal **data caching server** for both edge and centralized use cases.

---

## What is Harper Caching?

Harper caching lets you store **cached content** in standard tables, enabling you to:

- Expose cached entries as **queryable structured data** (e.g., JSON or CSV)
- Serve data to clients with **flexible formats and custom querying**
- Manage cache control with **timestamps and ETags** for downstream caching layers
- Implement **active or passive caching** patterns depending on your source and invalidation strategy

---

## Configuring a Cache Table

Define a cache table in your `schema.graphql`:

```graphql
type MyCache @table(expiration: 3600) @export {
  id: ID @primaryKey
}
```

- `expiration` is specified in seconds
- Expired (stale) records are refreshed from the source on the next access
- Eviction removes expired records from storage entirely
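The expiration rule above amounts to a timestamp comparison. A minimal sketch (illustrative only — Harper manages this internally, and `isExpired` is a hypothetical helper, not part of the Harper API):

```js
// Illustrative sketch of TTL-based staleness, not Harper's internal implementation.
// A record is stale once `expiration` seconds have passed since it was last
// updated; stale records are refreshed from the source on the next access.
function isExpired(updatedAtMs, expirationSeconds, nowMs = Date.now()) {
  return nowMs - updatedAtMs > expirationSeconds * 1000
}

// A record updated two hours ago with a one-hour TTL is stale:
const twoHoursAgo = Date.now() - 2 * 3600 * 1000
console.log(isExpired(twoHoursAgo, 3600)) // true
```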

---

## Connecting an External Source

Create a resource:

```js
import { Resource } from "harperdb"

export class ThirdPartyAPI extends Resource {
  async get() {
    const id = this.getId()
    const response = await fetch(`https://api.example.com/items/${id}`)
    if (!response.ok) {
      throw new Error("Source fetch failed")
    }
    return await response.json()
  }
}
```

Attach it to your table:

```js
import { tables } from "harperdb"
import { ThirdPartyAPI } from "./ThirdPartyAPI.js"

const { MyCache } = tables
MyCache.sourcedFrom(ThirdPartyAPI)
```

---

## Cache Behavior

1. Fresh data is returned immediately
2. Missing or stale data triggers a fetch
3. Concurrent misses are deduplicated
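Step 3 is the guard against cache stampedes: concurrent requests for the same missing key share a single in-flight fetch. A minimal sketch of the idea (not Harper's actual implementation; `fetchFromSource` is a hypothetical loader):

```js
// Minimal sketch of cache-miss deduplication ("cache stampede" protection).
// Concurrent gets for the same missing key share one in-flight fetch instead
// of each hitting the external source.
class DedupingCache {
  constructor(fetchFromSource) {
    this.fetchFromSource = fetchFromSource
    this.inFlight = new Map()
  }

  get(id) {
    // Reuse the pending fetch if one is already running for this id.
    if (this.inFlight.has(id)) return this.inFlight.get(id)
    const promise = this.fetchFromSource(id)
      .finally(() => this.inFlight.delete(id)) // clear once settled
    this.inFlight.set(id, promise)
    return promise
  }
}

// Two concurrent gets for the same id return the same promise:
const cache = new DedupingCache(async id => ({ id }))
console.log(cache.get("a") === cache.get("a")) // true
```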

---

## Active Caching

Use `subscribe()` to proactively update or invalidate cache entries:

```js
class MyAPI extends Resource {
  async *subscribe() {
    // stream updates from the external source
  }
}
```
See [Real Time Apps](real-time-apps.md) for more details.

---

## Write-Through Caching

Propagate updates upstream:

```js
class ThirdPartyAPI extends Resource {
  async put(data) {
    await fetch(`https://api.example.com/items/${this.getId()}`, {
      method: "PUT",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(data)
    })
  }
}
```

---

## Summary

Harper Caching allows you to:

- Cache external APIs efficiently
- Query cached data like native tables
- Prevent cache stampedes
- Build real-time or write-through caches
127 changes: 127 additions & 0 deletions template-react-ts/skills/jwt-authentication.md
@@ -0,0 +1,127 @@
# JWT Authentication

Harper uses **token-based authentication** with **JSON Web Tokens (JWTs)** to secure API and operations requests. JWT authentication enables long-lived sessions and token refresh without repeatedly sending Basic authentication credentials.

---

## Overview

Harper JWT authentication uses two token types:

- **operation_token**
Used to authenticate Harper operations via the `Authorization: Bearer` header.
Default expiration: **1 day**

- **refresh_token**
Used only to generate a new `operation_token` using the refresh operation.
Default expiration: **30 days**

If both tokens expire or are lost, new tokens can be generated using credentials.

---

## Creating Authentication Tokens

To generate JWT tokens, call the operations API with a username and password. This request does not require authentication.

```json
{
  "operation": "create_authentication_tokens",
  "username": "username",
  "password": "password"
}
```

### cURL Example

```bash
curl --request POST http://localhost:9925 \
  --header "Content-Type: application/json" \
  --data '{
    "operation": "create_authentication_tokens",
    "username": "username",
    "password": "password"
  }'
```

### Response

```json
{
  "operation_token": "<JWT_OPERATION_TOKEN>",
  "refresh_token": "<JWT_REFRESH_TOKEN>"
}
```

---

## Using JWT Authentication

Once generated, include the `operation_token` in the `Authorization` header for all authenticated requests:

Sample operation API request with JWT authentication:

```bash
curl --request POST http://localhost:9925 \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <JWT_OPERATION_TOKEN>" \
  --data '{
    "operation": "search_by_hash",
    "schema": "dev",
    "table": "dog",
    "hash_values": [1],
    "get_attributes": ["*"]
  }'
```

Sample REST request with JWT authentication (note that the REST interface listens on its own port, 9926 by default, rather than the operations API port):

```bash
curl --request GET http://localhost:9926/products/1 \
  --header "Authorization: Bearer <JWT_OPERATION_TOKEN>"
```

JWT authentication replaces Basic authentication for standard API operations and REST requests.

---

## Refreshing Tokens

When an `operation_token` expires, use the `refresh_token` to generate a new one:

```bash
curl --request POST http://localhost:9925 \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <JWT_REFRESH_TOKEN>" \
  --data '{
    "operation": "refresh_operation_token"
  }'
```

This returns a new `operation_token` that can be used for subsequent requests.
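A client can refresh proactively by reading the token's `exp` claim: JWT payloads are just base64url-encoded JSON, so expiry is readable without verifying the signature. A client-side sketch (signature validation still happens on the server; `tokenIsExpired` is a hypothetical helper):

```js
// Decode a JWT's payload (the middle, base64url-encoded segment) and read its
// `exp` claim. This does NOT verify the signature -- it only lets a client
// decide when to call refresh_operation_token before a request fails.
function tokenIsExpired(jwt, nowSeconds = Math.floor(Date.now() / 1000)) {
  const payload = JSON.parse(
    Buffer.from(jwt.split(".")[1], "base64url").toString("utf8")
  )
  return payload.exp !== undefined && payload.exp <= nowSeconds
}

// Build a demo token expiring at t=1000 and check it at t=2000:
const payload = Buffer.from(JSON.stringify({ exp: 1000 })).toString("base64url")
console.log(tokenIsExpired(`header.${payload}.signature`, 2000)) // true
```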

---

## Configuration

JWT expiration values can be configured in `harperdb-config.yaml`:

| Setting | Description | Default |
|---------|-------------|---------|
| `operationsApi.authentication.operationTokenTimeout` | Operation token lifetime | 1 day |
| `operationsApi.authentication.refreshTokenTimeout` | Refresh token lifetime | 30 days |

Adjust these values to match your security requirements.
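A sketch of how these settings might appear in `harperdb-config.yaml` (the duration syntax shown is an assumption — confirm against the configuration reference for your Harper version):

```yaml
operationsApi:
  authentication:
    operationTokenTimeout: 1d   # assumed duration syntax; default 1 day
    refreshTokenTimeout: 30d    # assumed duration syntax; default 30 days
```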

---

## RBAC and JWT Authentication

Harper Role-Based Access Control (RBAC) is **applied at request time** for all JWT-authenticated operations.

- JWTs authenticate **which user** is making the request
- RBAC determines **what that user is allowed to do**
- Permissions are resolved dynamically and are **not encoded in the token**

As a result:
- Role or permission changes take effect **immediately**
- JWTs do **not** need to be reissued when RBAC changes
- Authentication (JWT) and authorization (RBAC) remain cleanly separated
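The separation above can be pictured as a per-request role lookup (an illustration of the idea only; the role and permission names are hypothetical, not Harper's actual schema):

```js
// Illustrative sketch of request-time RBAC: the JWT only identifies the user;
// permissions are looked up fresh on every request, so role changes apply
// immediately without reissuing tokens.
const rolePermissions = new Map([
  ["admin", new Set(["read", "insert", "update", "delete"])],
  ["reader", new Set(["read"])],
])

function isAuthorized(userRole, operation) {
  const perms = rolePermissions.get(userRole)
  return perms !== undefined && perms.has(operation)
}

console.log(isAuthorized("reader", "read"))   // true
console.log(isAuthorized("reader", "delete")) // false
```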

---

## Summary

- JWT authentication secures Harper operations using bearer tokens
- Operation tokens authenticate requests
- Refresh tokens renew expired operation tokens
- Token lifetimes are configurable
131 changes: 131 additions & 0 deletions template-react-ts/skills/using-blob-datatype.md
@@ -0,0 +1,131 @@
# Blob (Binary Large Objects)

Harper supports **Blobs** — binary large objects for storing unstructured or large binary data — with integrated streaming support and efficient storage. Blobs are ideal for media files, documents, and any data where size or throughput makes standard JSON fields impractical.

---

## What Are Blobs

Blobs extend the native JavaScript `Blob` type and allow you to store **binary or arbitrary data** inside Harper tables. The blob reference is stored in the record, while the blob’s contents are streamed to and from storage.

- Designed for binary data such as images, audio, and documents
- Supports streaming reads and writes
- Blob data is stored separately from record attributes
- Optimized for large payloads
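Because Harper blobs extend the native JavaScript `Blob`, the standard `Blob` interface applies. A quick refresher using Node's built-in `Blob` (no Harper required):

```js
// Harper blobs extend the standard JavaScript Blob, so the familiar Blob API
// (size, type, buffered and streaming reads) carries over.
const blob = new Blob(["hello"], { type: "text/plain" })

console.log(blob.size) // 5 (bytes)
console.log(blob.type) // "text/plain"

// Contents can be read buffered or as a stream:
blob.text().then(text => console.log(text)) // "hello"
```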

---

## Defining Blob Fields

Declare a blob field using the `Blob` type in your schema:

```graphql
type MyTable @table {
  id: ID @primaryKey
  data: Blob
}
```

Any record written to this field will store a reference to the blob’s contents.

---

## Creating and Storing Blobs

### Creating a Blob from a Buffer

```js
const blob = createBlob(largeBuffer)
await MyTable.put({ id: "my-record", data: blob })
```

- `createBlob()` returns a blob reference
- Data is streamed to storage asynchronously
- Records may be committed before the blob finishes writing

---

### Creating a Blob from a Stream

```js
const blob = createBlob(stream)
await MyTable.put({ id: "streamed-record", data: blob })
```

Streaming allows large data to be written without loading it fully into memory.

---

## Reading Blob Data

Retrieve a record and read its blob contents:

```js
const record = await MyTable.get("my-record")
const buffer = await record.data.bytes()
```

Blob objects also support streaming interfaces for large reads.

---

## Blob Attributes and Events

### Size

The blob size may not be immediately available when streaming:

```js
if (blob.size === undefined) {
  blob.on("size", size => {
    console.log("Blob size:", size)
  })
}
```

---

### saveBeforeCommit

Blobs are not atomic while streaming. To ensure the blob is fully written before committing the record:

```js
const blob = createBlob(stream, { saveBeforeCommit: true })
await MyTable.put({ id: "safe-record", data: blob })
```

---

## Error Handling

Handle streaming errors by attaching an error listener:

```js
blob.on("error", () => {
  MyTable.invalidate("my-record")
})
```

This prevents partially written blobs from being used.

---

## Automatic Coercion

When a field is defined as `Blob`, assigning a string or buffer automatically converts it into a blob when using `put`, `patch`, or `publish`.
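The coercion can be pictured as a small normalization step (an illustration of the idea, not Harper's internal code — Harper performs this automatically during `put`, `patch`, or `publish`):

```js
// Sketch of automatic coercion: values written to a Blob-typed field are
// normalized into Blob instances. Illustrative only -- Harper does this
// internally for put/patch/publish.
function coerceToBlob(value) {
  if (value instanceof Blob) return value
  return new Blob([value]) // strings and buffers become the blob's contents
}

console.log(coerceToBlob("hi").size)              // 2
console.log(coerceToBlob(new Blob(["abc"])).size) // 3
```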

---

## Related Skill

- [Handling Binary Data with Blobs](handling-binary-data.md): How to store and serve binary data like images or MP3s using the Blob data type.

---

## Summary

- Blobs store large or binary data efficiently
- Blob fields reference streamed content
- Supports buffered and streamed writes
- Optional write-before-commit behavior
- Integrates seamlessly with Harper tables