Hypermedia and APIs

Henry Andrews edited this page Oct 19, 2017 · 11 revisions

THIS PAGE IS OUTDATED

Hyper-Schema draft-07 is intended to stand on its own without needing this level of explanation.

Collecting feedback in issue #287

[DISCLAIMER: This is @handrews's view, and subject to debate and revision. It will probably move to the web site as it gains acceptance, if it gains acceptance.]

JSON Hyper-Schema (JHS) provides mechanisms to describe links, but assumes a thorough understanding of hypermedia concepts on the part of the hyper-schema authors and consumers in order to use it correctly in an API. The knowledge necessary to fully understand how to make use of hypermedia is scattered across many RFCs, some less obvious than others. This page gathers the key points into a guide to understanding how to write and use JSON Hyper-Schema Link Description Objects (LDOs).

True hypermedia systems concentrate out-of-band knowledge in shared vocabularies. This page will not delve into that aspect of hypermedia in detail, as it is covered in many other places. The focus here is to explain how to use JHS as a media type in a proper hypermedia system, and what implications that has for the role of JHS in API design.

Anatomy of a hyperlink

As defined in RFC 5988, a link is "a typed connection between two resources that are identified by Internationalised Resource Identifiers." Links are directional, pointing from a context resource to a target resource. The link's relation IRI identifies the link's type. (Note: Since Hyper-Schema does not currently address IRIs, this document will primarily discuss URIs, but the principles remain the same.)

These three IRIs/URIs (context, target, and relation) are the minimal elements needed to define a link, with the context most often implicitly defined by the position of the link within a document. Additionally, a link may include several other kinds of information, including presentation metadata, target hints, an explicit context URI, or a description of additional data that can be used with the link.
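As a rough illustration of the parts just listed, a link can be modelled as a small record. This is only a sketch: the class name, field names, and example.com URIs are hypothetical and are not taken from any Hyper-Schema draft.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Link:
    """A link in the RFC 5988 sense: context --relation--> target."""
    context: str   # URI of the resource the link points from
    relation: str  # link relation type (a registered name or a URI)
    target: str    # URI of the resource the link points to
    # Optional extras (hypothetical field names); they do not change
    # which three resources the link connects.
    title: Optional[str] = None  # presentation metadata
    target_hints: Dict[str, Any] = field(default_factory=dict)  # advisory only


def link_from_document(document_uri: str, relation: str, target: str,
                       **extras: Any) -> Link:
    """Build a link whose context is implied by the document containing it."""
    return Link(context=document_uri, relation=relation, target=target, **extras)


link = link_from_document(
    "https://api.example.com/orders/42",  # hypothetical context document
    "author",
    "https://api.example.com/people/7",   # hypothetical target
    title="Placed by",
)
print(link.context, link.relation, link.target)
```

The helper mirrors the common case in which the context is never written down and is taken from the URI of the document that contains the link.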

Presentation metadata is more or less irrelevant to link semantics. Fields such as "title" may clarify the link's semantics for humans, but are not expected to be parsed or analyzed by automated systems.

Target hints are hints because only the target resource itself can provide authoritative information about its representation. Despite being non-authoritative, target hints are often useful shortcuts for content negotiation and other things that, in an HTTP system, might require a HEAD or OPTIONS request to be issued first. However, requests that depend on target hints may fail or produce an unexpected response at runtime.
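A client might consult a hint before deciding whether a request is worth attempting. The sketch below assumes an LDO that carries a draft-07 style "targetHints" object with an "allow" array; the function names and the sample LDO are hypothetical. The check is advisory only, so a missing hint permits the attempt, and a real client still has to handle a failure response from the target.

```python
from typing import Any, Dict, List


def hinted_methods(ldo: Dict[str, Any]) -> List[str]:
    """Return the HTTP methods suggested by the LDO's "allow" hint, if any."""
    allow = ldo.get("targetHints", {}).get("allow", [])
    return [method.upper() for method in allow]


def worth_trying(ldo: Dict[str, Any], method: str) -> bool:
    """Advisory check: an absent hint tells us nothing, so allow the attempt."""
    methods = hinted_methods(ldo)
    return not methods or method.upper() in methods


# Hypothetical LDO; only the target itself can say what it really allows.
ldo = {
    "rel": "self",
    "href": "https://api.example.com/things/1",
    "targetHints": {"allow": ["GET", "PUT"]},
}
print(worth_trying(ldo, "PUT"), worth_trying(ldo, "DELETE"))
```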

Links vs operations

There is an important and often-overlooked distinction between links (which express a relationship between two entities) and operations (the set of possible interactions that a client currently examining the context resource can have via the link with the target resource).

In a hypermedia system, links are explicitly defined, while operations are implied by link relations, URI schemes, and any protocol indicated by the URI scheme. Additionally, in JHS, the "submissionSchema" keyword is necessary to enable operations with the semantics of the target resource processing arbitrary data. For other operations, "targetSchema" may provide advisory information, but unlike with "submissionSchema", its absence does not imply any restriction on available operations.
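One way to picture this: a client derives candidate operations from the link instead of reading them from it. The sketch below is illustrative only. The function name and the method lists are assumptions, mapping "processing arbitrary data" to HTTP POST is this example's interpretation, and it ignores any further restriction a link relation might impose.

```python
from typing import Any, Dict, List
from urllib.parse import urlsplit

# Assumed for illustration: methods that work with the target's own
# representation and therefore do not depend on "submissionSchema".
REPRESENTATION_METHODS = ["DELETE", "GET", "HEAD", "PATCH", "PUT"]


def candidate_operations(ldo: Dict[str, Any]) -> List[str]:
    """Infer operations a client may attempt; nothing here is guaranteed."""
    # Assumes an absolute "href"; a relative one would need resolving first.
    scheme = urlsplit(ldo["href"]).scheme.lower()
    if scheme == "data":
        return ["read"]  # local operation only
    if scheme not in ("http", "https"):
        return []        # unknown scheme: infer nothing
    operations = list(REPRESENTATION_METHODS)
    if "submissionSchema" in ldo:
        operations.append("POST")  # target processes submitted data
    return sorted(operations)


plain = {"rel": "self", "href": "https://api.example.com/things/1"}
with_submission = dict(plain, submissionSchema={"type": "object"})
print(candidate_operations(plain))
print(candidate_operations(with_submission))
```

Note that the absence of "targetSchema" plays no part in the decision, matching the point that only "submissionSchema" gates an operation.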

Remote vs local operations

URI schemes that map to network protocols allow for remote operations. For a link whose target URI uses the http or https scheme, the possible operations correspond to the HTTP methods.

Other URI schemes allow for local operations. The data: URI scheme is used to include a data string inline in another hypermedia format. The only operation you can do with such a URI is read its data.
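To make the local case concrete, the following sketch reads a data: URI with Python's standard library. No network connection is involved, because the content is carried inside the URI itself. The payload is made up for the example.

```python
import base64
from urllib.request import urlopen

# Build a data: URI carrying a short, made-up text payload.
payload = base64.b64encode(b"Hello, hypermedia!").decode("ascii")
uri = "data:text/plain;charset=utf-8;base64," + payload

# "Reading" is the only operation on offer; urlopen decodes the inline data.
with urlopen(uri) as response:
    media_type = response.headers.get_content_type()
    body = response.read().decode("utf-8")

print(media_type, body)
```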

Resources vs endpoints

Another common confusion is between resources (abstractions identified by a URI with which one may or may not be able to communicate) and endpoints (URLs to which a client can establish a connection in order to perform remote operations).

The concept of an endpoint is common to both RPC and REST, while resources are more or less specific to REST. Technically, there is a resource at an RPC endpoint, because everything is a resource. But unlike in REST, you do not interact with it through its own representation. It is simply a conduit through which you execute the calls.

In either system, you need endpoints in order to perform remote operations, whether they are RESTful interactions with the resource at that endpoint or not.

No guarantee of operations

Note that even if a URI is a URL indicating a protocol, or has a URI scheme that defines one or more local operations, there is no guarantee that it is suitable for use with those operations or that protocol. For instance, some link target URIs are intended only as identifiers, and may not be dereferenceable even if they appear to be.

This sort of restriction should be conveyed by the link's semantics. We will go into more detail shortly, but as an example, the "profile" link relation specifies that its target URI is an identifier only, and SHOULD NOT automatically be dereferenced even if it appears to be a URL based on its scheme.

API descriptions and operations

It's worth taking a slight digression here into API description systems such as OpenAPI, RAML, etc.

A single JSON Hyper-Schema corresponds to a single resource set defined by the possible expansions of its "self" link's URI Template. The combination of a JHS and an associated JSON instance functions as a hypermedia representation of a single resource.
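The idea of a resource set can be sketched as follows: one hyper-schema, many instances, each expanding the "self" template to a different URI. The schema, the example.com URIs, and both helper functions are hypothetical, and the expansion handles only plain {name} variables, a small subset of URI Template (RFC 6570) behaviour.

```python
import re
from typing import Any, Dict, Optional
from urllib.parse import quote

# Hypothetical hyper-schema with a templated "self" link.
hyper_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}},
    "links": [{"rel": "self", "href": "https://api.example.com/things/{id}"}],
}


def expand_simple(template: str, instance: Dict[str, Any]) -> str:
    """Expand plain {name} variables only, percent-encoding each value."""
    return re.sub(
        r"\{(\w+)\}",
        lambda match: quote(str(instance[match.group(1)]), safe=""),
        template,
    )


def self_uri(schema: Dict[str, Any], instance: Dict[str, Any]) -> Optional[str]:
    """Return the URI of the resource that this instance represents."""
    for ldo in schema.get("links", []):
        if ldo.get("rel") == "self":
            return expand_simple(ldo["href"], instance)
    return None


# Two instances of the same schema are two members of one resource set.
print(self_uri(hyper_schema, {"id": 42}))
print(self_uri(hyper_schema, {"id": 43}))
```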

An API description document describes all possible endpoints and operations of a finite set of resource sets. This set of sets is generally closed and managed by some authority that controls the API. Most API description systems focus primarily or entirely on HTTP, with syntax that will not work with any other protocol.

Hypermedia formats work in terms of links and resources, and form open-ended systems of interconnected documents. They are used to represent resources in a hypermedia API, but do not correspond to an entire API. (Unless, of course, there is a resource representing the entire API.) While humans may perceive boundaries grouping hypermedia documents into "an API" or "a web site", these are not inherent in the formats themselves.

API description systems focus on operations and endpoints in a closed set of resource sets, including the various possible responses to each operation. They are intended to statically document and in some cases generate client code for an entire API.

Notably, the focus that API description systems put on operations allows them to describe link usage that does not correspond to standards such as RFC 7231 (HTTP/1.1 Semantics and Content). Hypermedia formats do not explicitly describe operations, and therefore have no such capability.

JSON Hyper-Schema drafts and API description

Hyper-Schema draft-luff-json-hyper-schema-00, a.k.a. Draft 04, made changes to "method" that made it more operation-oriented than the drafts that preceded and followed it. Since Draft 04 was the de facto standard during the gap of more than three years before draft-wright-json-schema-hyperschema-00 (a.k.a. Draft 05), this has caused many Hyper-Schema users to view JHS as an API description system.

Draft 04 is the only JHS draft that used "method" to directly specify HTTP methods. This meant that multiple LDOs were necessary to describe multiple HTTP operations on the same link. LDOs became operation descriptions instead of true link descriptions.
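The effect is easy to see when Draft 04 style LDOs are grouped by target. In the sketch below the LDOs, their "rel" values, and the helper function are made up for illustration; the only Draft 04 behaviour relied on is that "method" names an HTTP method and defaults to GET.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hypothetical Draft 04 style LDOs: one entry per HTTP operation.
draft04_style_links = [
    {"rel": "self", "href": "/things/{id}"},  # "method" omitted, so GET
    {"rel": "update", "href": "/things/{id}", "method": "PUT"},
    {"rel": "delete", "href": "/things/{id}", "method": "DELETE"},
]


def operations_per_target(links: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group LDOs by href to show how many of them describe one target."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ldo in links:
        grouped[ldo["href"]].append(ldo.get("method", "GET"))
    return dict(grouped)


print(operations_per_target(draft04_style_links))
```

Three LDOs collapse to a single target here, which is the sense in which they describe operations and not distinct links.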

This had a secondary effect of encouraging Hyper-Schema authors to treat "targetSchema" as the schema for the operation's response, even though it is defined to describe the target resource's representation in all drafts of JSON Hyper-Schema. A given operation's response may or may not match the target's representation, but in all cases the only authoritative schema for an operation's response is the one provided in the response.

Link and operation semantics

The biggest question in hypermedia systems, particularly truly RESTful APIs (that meet all required constraints from Fielding's dissertation), is how an automated client is supposed to understand what it can do.

A truly RESTful API should be usable by a generic client library, or agent, that understands the URI schemes, media types, protocols, and standardized link relations found within the API. Such an agent is to REST APIs as web browsers are to human-oriented hypermedia, a.k.a. the World Wide Web.

TODO: Explain how to determine semantics! It's kinda the whole point of this page.