166 changes: 89 additions & 77 deletions CLAUDE.md
@@ -2,122 +2,134 @@

## What This Is

**DEPRECATED**: Legacy PHP library for connecting to AWS Elasticsearch Service. This library has been superseded by `aws-opensearch-php-handler`, which uses the OpenSearch client instead of the deprecated Elasticsearch client.
> **DEPRECATED — DO NOT USE FOR NEW PROJECTS**
>
> This library has been fully superseded by [`aws-opensearch-php-handler`](https://github.com/bfansports/aws-opensearch-php-handler).
> AWS renamed Elasticsearch Service to OpenSearch Service in 2021. This library uses the abandoned `elasticsearch/elasticsearch` v6.x PHP client.
> **No consumers in the bFAN codebase actively use this library** — migration is complete.
> This repo should be archived on GitHub.

A PHP library for querying AWS Elasticsearch (pre-OpenSearch migration). Provides Lucene query syntax interface for searching, counting, and retrieving documents.
Legacy PHP library for connecting to AWS Elasticsearch Service. Provides a Lucene query syntax interface for searching, counting, aggregating, and managing documents via AWS Signature V4-authenticated requests.

## Tech Stack

- **Language**: PHP 7.0+
- **AWS SDK**: `aws/aws-sdk-php` 3.x — AWS authentication and signing
- **Elasticsearch Client**: `elasticsearch/elasticsearch` (version not pinned in composer.json)
- **Autoloading**: PSR-0 (namespace `SA`)

## Quick Start

```bash
# Installation (deprecated — use aws-opensearch-php-handler instead)
composer require bfansports/aws-elasticsearch-php-handler

# Usage (same API as OpenSearch version)
use SA\ElasticsearchHandler;
## Deprecation & Migration Status

$client = new ElasticsearchHandler(["https://search-domain.region.es.amazonaws.com:443"]);
| Item | Status |
|------|--------|
| Successor repo | `bfansports/aws-opensearch-php-handler` (active, v1.0.10+) |
| sa_site_v2 (admin panel) | Migrated to OpenSearch handler in `composer.json` |
| sa_site_daemons | Migrated to OpenSearch handler |
| PHP code imports | **No PHP files** import `ElasticsearchHandler` anywhere in the codebase |
| composer.lock residue | `sa_site_v2/scripts/composer.lock` still resolves to this repo's git URL (stale lock — see Gotchas) |
| GitHub repo status | **Not archived** (should be) |
| Packagist | May have namespace overlap with opensearch handler — needs verification |

$index = "organizations";
$query = "name:\"Lakers\" AND active:true";
$count = 20;
$sort = "createdAt:desc";
$type = "organization"; // Types are deprecated in ES 7+
### Migration Checklist (for any remaining consumers)

// Get full ES response with metadata
$results = $client->raw($index, $query, $count, $sort, $type);
1. Replace `bfansports/aws-elasticsearch-php-handler` with `bfansports/aws-opensearch-php-handler` in `composer.json`
2. Change `use SA\ElasticsearchHandler;` to `use SA\OpensearchHandler;`
3. Change `new ElasticsearchHandler()` to `new OpensearchHandler()`
4. Remove `$type` parameter from all method calls (OpenSearch 2+ does not support types)
5. Run `composer update` to regenerate lock file
6. Test queries — API is otherwise identical

// Get just the source documents (convenience method)
$docs = $client->query($index, $query, $count, $sort, $type);
## Tech Stack

// Get count only (no document retrieval)
$totalCount = $client->count($index, $query, $type);
```
- **Language**: PHP 7.0+ (EOL — OpenSearch handler requires PHP 7.4+)
- **AWS SDK**: `aws/aws-sdk-php` 3.x — AWS authentication and signing
- **Elasticsearch Client**: `elasticsearch/elasticsearch` ~6.7 (abandoned, EOL)
- **HTTP Transport**: Guzzle via `GuzzleHttp\Ring` (abandoned library)
- **Autoloading**: PSR-0 (deprecated; OpenSearch handler also uses PSR-0)

## Project Structure

- `src/SA/ElasticsearchHandler.php` — Main handler class
- `composer.json` — Dependencies and autoload config
- `LICENSE` — MIT-style license
- `README.md` — Usage instructions
- `.github/` — GitHub Actions workflow (if present)
```
src/SA/ElasticsearchHandler.php # Main handler class (346 lines)
composer.json # Dependencies and autoload config
LICENSE # MIT license
README.md # Usage instructions (outdated parameter order)
.github/workflows/ # S3 backup workflow (targets wrong branch)
```

## Dependencies

**External:**
- AWS Elasticsearch Service (pre-OpenSearch migration)
- AWS Elasticsearch Service (now OpenSearch Service)
- AWS IAM credentials — for signing requests

**Consumed by:**
- Legacy bFAN PHP services (if any still use this)
- `sa_site_v2` (migrated to `aws-opensearch-php-handler`)

<!-- Ask: Are there any components still using this library, or has everything migrated to aws-opensearch-php-handler? -->
<!-- Ask: Should this repo be archived or marked as deprecated in GitHub? -->
- **No active consumers** — migration to `aws-opensearch-php-handler` is complete
- Stale reference in `sa_site_v2/scripts/composer.lock` (lock file only, not composer.json)

## API / Interface

**Main class**: `SA\ElasticsearchHandler`

**Constructor:**
```php
new ElasticsearchHandler(array $hosts)
new ElasticsearchHandler(array $endpoints)
```
- `$hosts` — Array of Elasticsearch endpoint URLs (HTTPS)
- `$endpoints` — Array of Elasticsearch endpoint URLs (HTTPS)
- Reads `$_SERVER['AWS_DEFAULT_REGION']` directly (no parameter, no fallback)
- Reads `$_SERVER['AWS_PROFILE']` for SSO credential provider

**Query methods (all accept raw Lucene query strings):**
- `raw($index, $query, $count = 1, $sort = null, $offset = 0, $type = null)` — Full ES response
- `query(...)` — Same params as `raw()`, returns `_source` documents only
- `count($index, $query, $type = null)` — Document count
- `aggregate($index, $query, $data, $type = null)` — Aggregation results
- `scan($index, $query, $type = null)` — Paginated full-index scan (unbounded memory)
- `search($params = [])` — Raw pass-through to ES client

**Index management:**
- `createIndex($index)`, `deleteIndex($index)`, `indexExists($index)`
- `getIndexSettings($params)`, `putIndexSettings($params)`
- `getIndexMapping($params)`, `putIndexMapping($params)`
- `updateIndexAliases($params)`, `getIndexAliases()`
- `reindex($params)`, `indices()`

**Document operations:**
- `createDocument($index, $data, $id = null, $type = null)`
- `deleteDocument($index, $id)`
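
Because `$offset` sits before `$type` in `raw()`, a call written against an older five-argument form silently shifts its last argument. The stub below is a hypothetical stand-in: it copies only the signature listed above, and its body is invented for illustration, to show where each positional argument lands.

```php
<?php
// Hypothetical stub: mirrors the documented raw() signature only.
// The real method queries Elasticsearch; this one just echoes its arguments.
function raw($index, $query, $count = 1, $sort = null, $offset = 0, $type = null): array
{
    return compact('index', 'query', 'count', 'sort', 'offset', 'type');
}

// Call written against the older (index, query, count, sort, type) order:
// the fifth argument is bound to $offset, and $type stays null.
$shifted = raw('organizations', 'active:true', 20, 'createdAt:desc', 'organization');

// Call with the offset passed explicitly, so the type lands in the sixth slot.
$explicit = raw('organizations', 'active:true', 20, 'createdAt:desc', 0, 'organization');
```

Any remaining caller that passes a type as the fifth argument would need the offset inserted, or the type dropped entirely when moving to the OpenSearch handler.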

**Methods:**
- `raw($index, $query, $count = 10, $sort = "", $type = null)` — Returns full ES response with metadata
- `query($index, $query, $count = 10, $sort = "", $type = null)` — Returns array of source documents only
- `count($index, $query, $type = null)` — Returns total count of matching documents
## Key Patterns

**Query syntax**: Lucene query string syntax (e.g., `field:value AND other:foo`)
- **AWS Signature V4**: All requests signed via `Aws\Signature\SignatureV4`
- **RingPHP bridge**: Converts PSR-7 responses to RingPHP format for the ES client
- **Lucene query strings**: Uses `query_string` query type (injection risk if user input is unsanitized)
- **search_after pagination**: `scan()` method uses `search_after` for deep pagination
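
A minimal sketch of `search_after` pagination written as a generator, so that only one page is held in memory at a time. This is not how `scan()` is implemented. The `$fetchPage` callable and the in-memory `$fakeFetch` are hypothetical stand-ins for a real search call that returns hits carrying `_source` and `sort` keys. It needs PHP 7.1+ for the nullable type.

```php
<?php
declare(strict_types=1);

/**
 * Lazily yields documents page by page.
 *
 * $fetchPage is a hypothetical callable: it receives the previous page's last
 * sort values (or null for the first page) and returns a list of hits.
 */
function scanLazily(callable $fetchPage): Generator
{
    $searchAfter = null;
    while (true) {
        $hits = $fetchPage($searchAfter);
        if ($hits === []) {
            return; // No more pages.
        }
        foreach ($hits as $hit) {
            yield $hit['_source'];
        }
        // The sort values of the last hit become the cursor for the next page.
        $searchAfter = $hits[count($hits) - 1]['sort'];
    }
}

// In-memory fake standing in for a cluster: serves two documents per page.
$docs = [['id' => 1], ['id' => 2], ['id' => 3]];
$fakeFetch = function (?array $after) use ($docs): array {
    $start = $after === null ? 0 : $after[0];
    $hits = [];
    foreach (array_slice($docs, $start, 2) as $i => $doc) {
        $hits[] = ['_source' => $doc, 'sort' => [$start + $i + 1]];
    }
    return $hits;
};

$collected = iterator_to_array(scanLazily($fakeFetch), false);
```

A caller that iterates the generator with `foreach` instead of collecting it keeps memory bounded by the page size.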

## Key Patterns
## Security Findings (from AI Audit 2026-02-17)

- **AWS Signature V4 authentication**: Uses AWS SDK to sign requests to Elasticsearch
- **IAM role-based access**: Relies on IAM role permissions
- **Lucene query strings**: Simple query format (not full Elasticsearch DSL)
- **Type parameter**: Supports Elasticsearch types (deprecated in ES 7+, removed in OpenSearch 2+)
- **Query injection**: All `$query` parameters pass raw Lucene syntax to `query_string` — no sanitization. Callers must escape special characters if handling user input.
- **Silent error swallowing**: HTTP error callback returns `$error['response']` without logging. Can mask failures and produce confusing TypeErrors downstream.
- **No request timeout**: Only credential provider has a timeout. HTTP requests can hang indefinitely.
- **Undeclared `$cacheKey` property**: `getCacheKey()`/`setCacheKey()` use a dynamic property — triggers `E_DEPRECATED` on PHP 8.2+.
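
For the query-injection finding, callers could escape user-supplied values before interpolating them into a Lucene string. The helper below is a sketch, not part of this library: `escapeLuceneTerm()` is a hypothetical name, and the character set follows the reserved characters of `query_string` syntax as I understand them (`<` and `>` cannot be escaped, so they are dropped).

```php
<?php
declare(strict_types=1);

// Hypothetical helper: escapes one user-supplied value for use inside a
// query_string query. It does not validate field names or operators.
function escapeLuceneTerm(string $input): string
{
    // "<" and ">" cannot be escaped in query_string syntax, so remove them.
    $stripped = str_replace(['<', '>'], '', $input);

    // Backslash-escape every remaining reserved character.
    $escaped = preg_replace('#([+\-=!(){}\[\]^"~*?:\\\\/&|])#', '\\\\$1', $stripped);

    return $escaped ?? '';
}

// Example: build a query from untrusted input.
$userInput = 'Lakers" OR active:true';
$query = 'name:"' . escapeLuceneTerm($userInput) . '"';
```

Uppercase `AND`, `OR`, and `NOT` are still operators outside a quoted phrase, so this sketch wraps the escaped value in double quotes. Treat it as a starting point, not a complete defense.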

## Environment

**Required IAM permissions:**
- `es:ESHttpGet` on the Elasticsearch domain
- `es:ESHttpPost` if using write operations (not exposed by current API)
**Required:**
- `$_SERVER['AWS_DEFAULT_REGION']` — AWS region (no fallback; errors if unset)
- IAM credentials via role, environment vars, or `~/.aws/credentials`

**AWS credentials**: Must be available via IAM role or environment variables.
**Optional:**
- `$_SERVER['AWS_PROFILE']` — Uses SSO credential provider when set
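
Since the constructor reads `$_SERVER` directly and has no fallback, a caller may want to validate the environment before instantiating the handler. The functions below are a hypothetical guard, not library code. They take the array as a parameter so they can be exercised without touching the real `$_SERVER`.

```php
<?php
declare(strict_types=1);

// Hypothetical guard: returns the region or fails with a clear message.
function resolveRegion(array $server): string
{
    $region = $server['AWS_DEFAULT_REGION'] ?? '';
    if (!is_string($region) || $region === '') {
        throw new RuntimeException('AWS_DEFAULT_REGION is not set');
    }
    return $region;
}

// Hypothetical helper: the profile is optional, so null means "not configured".
function resolveProfile(array $server): ?string
{
    $profile = $server['AWS_PROFILE'] ?? null;
    return is_string($profile) && $profile !== '' ? $profile : null;
}

// Example with a sample array; real code would pass $_SERVER.
$sample = ['AWS_DEFAULT_REGION' => 'eu-west-1'];
$region = resolveRegion($sample);
$profile = resolveProfile($sample);
```

Calling `resolveRegion($_SERVER)` during bootstrap would surface a missing region as one explicit exception.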

## Deployment

**Deprecated**: Do not deploy new versions. Migrate consumers to `aws-opensearch-php-handler`.

**Migration path:**
1. Replace `bfansports/aws-elasticsearch-php-handler` with `bfansports/aws-opensearch-php-handler` in `composer.json`
2. Change `use SA\ElasticsearchHandler;` to `use SA\OpensearchHandler;`
3. Update class instantiation: `new ElasticsearchHandler()` → `new OpensearchHandler()`
4. Remove `$type` parameter from method calls (OpenSearch 2+ does not support types)
5. Test queries and verify results
**DO NOT DEPLOY.** This library is deprecated. Migrate consumers to `aws-opensearch-php-handler`.

## Testing

<!-- Ask: Does this repo have unit tests? -->

**Manual testing**: Not recommended. Use `aws-opensearch-php-handler` instead.
No unit tests or test infrastructure exist. The only option is manual testing against a live ES cluster.

## Gotchas

- **DEPRECATED**: AWS Elasticsearch Service was rebranded to OpenSearch Service in 2021. This library uses the deprecated `elasticsearch-php` client.
- **Elasticsearch types removed**: Elasticsearch 7.x deprecated types, and OpenSearch 2.x removed them entirely. The `$type` parameter in this library is obsolete.
- **No longer maintained**: This library should not receive updates. All development effort goes to `aws-opensearch-php-handler`.
- **Composer version conflicts**: The `elasticsearch/elasticsearch` dependency may conflict with other packages that require newer versions.
- **Security updates**: The Elasticsearch PHP client may have unpatched vulnerabilities. Migrate to OpenSearch ASAP.

<!-- Ask: What's the timeline for fully deprecating this library? -->
<!-- Ask: Are there any blocking issues preventing migration to aws-opensearch-php-handler? -->
<!-- Ask: Should we add a deprecation notice to the README and composer.json? -->
- **DEPRECATED**: AWS Elasticsearch Service was rebranded to OpenSearch Service in 2021. This library uses the abandoned `elasticsearch-php` v6.x client.
- **Stale composer.lock**: `sa_site_v2/scripts/composer.lock` maps the OpenSearch package name to this repo's git URL. This is a Packagist/Composer resolution artifact. Running `composer update` in that directory should fix it, but verify the resolved URL changes to the OpenSearch repo.
- **README parameter order is wrong**: README shows `raw($index, $query, $count, $sort, $type)` but actual signature has `$offset` before `$type`.
- **`scan()` loads everything into memory**: No streaming, no memory limit. Can OOM on large indices.
- **Backup workflow targets `develop` branch**: The default branch is `master`, so the workflow likely never runs.
- **`createDocument()` derives type from index name**: Uses `explode("_", $index)[0]` as the default `$type` — fragile naming convention.
- **`getRetriesCount()` / `setRetriesCount()` are dead code**: Empty bodies, return null.
- **Elasticsearch types are removed**: The `$type` parameter throughout this library is obsolete (types were deprecated in ES 7 and removed in ES 8 and OpenSearch 2).
- **`actions/checkout@v2`**: GitHub Actions workflow uses deprecated action version (Node.js 12 EOL).
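
To illustrate the `createDocument()` gotcha above, the sketch below isolates the reported expression `explode("_", $index)[0]` in a hypothetical function, `deriveDefaultType()`, and applies it to a few made-up index names. Only the expression comes from this document. The function name and the sample indices are assumptions.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper around the expression reported in the gotcha:
// everything before the first underscore becomes the default type.
function deriveDefaultType(string $index): string
{
    return explode("_", $index)[0];
}

// Made-up index names showing how the convention behaves.
$versioned  = deriveDefaultType('organization_v2'); // "organization"
$multiWord  = deriveDefaultType('user_profiles');   // "user", not "user_profiles"
$noSeparator = deriveDefaultType('organizations');  // "organizations"
```

An index whose logical name itself contains an underscore gets a truncated type, which is why the convention is fragile.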