API Design proposal

Reading list is a service to keep track of a list of articles to read. The state of the reading list is synchronized between devices.

Contents

Authentication
- Hawk
Data model
API
Batch operations
- POST /batch
- Massive operations
Server timestamps
Backoff indicators
Retry-After indicators
Error responses
API versioning
- Deprecation

Authentication

Use the OAuth token with this header:

Authorization: Bearer <oauth_token>

note:	This approach is straightforward, but it requires the reading list API to check the token against the FxA server on each request. This could be avoided by using Hawk, which would instead associate an encryption token with each FxA user id.

Obtain the token

Navigate the client to /v1/fxa-oauth/login. There, a session cookie will be set, and the client will be redirected to a login form on the FxA content server.
After submitting the credentials on the login page, the client will be redirected to /v1/fxa-oauth/token, where its session will be validated. Then, an OAuth token will be returned inside a JSON object:

{
    "token": "oihyvuh-fefe-ldpieo98963fyhrn"
}
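
As a minimal sketch, a client could turn the JSON object above into the Authorization header as follows. The helper name `bearer_header` is hypothetical and not part of the proposal.

```python
import json


def bearer_header(token_response: str) -> dict:
    """Build request headers from the JSON body returned by /v1/fxa-oauth/token."""
    token = json.loads(token_response)["token"]
    return {"Authorization": f"Bearer {token}"}


# Sample token taken from the example response above.
headers = bearer_header('{"token": "oihyvuh-fefe-ldpieo98963fyhrn"}')
```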

Reading list scope

The reading list API will eventually have to handle a dedicated OAuth scope (e.g. readinglist, readinglist:read, readinglist:write). This will help users delegate access to the reading list to third-party apps.

So far the FxA server only handles the profile scope.

See https://github.com/mozilla-services/readinglist/issues/16.

Hawk

(Draft, undecided)

A Hawk token could be generated from an OAuth bearer token, a Basic auth login/password or a simple string.

Pros and cons

Session is persisted in the API, the bearer token is not checked at each request
If FxA is down, existing sessions may still allow requests to be authenticated
User authentication could be decorrelated from Firefox Accounts
Clients could start using the API without FxA (e.g. single device)
Requests cannot be replayed by an attacker
Obtained Hawk tokens cannot be used anywhere else, unlike a Bearer token, which can be reused across the whole FxA ecosystem

Data model

Articles

Attribute	Type	Comment
`id`	UUID
`last_modified`	Timestamp	server timestamp
`url`	URL	valid (RFC)
`title`	String(1024)	1 character min.
`resolved_url`	URL
`resolved_title`	String(1024)
`excerpt`	Text	first 200 words of the article
`preview`	Text	URL to preview image of article
`status`	Enum	{0: OK, 1: archived, 2: deleted}
`favorite`	Boolean
`is_article`	Boolean
`word_count`	Integer
`unread`	Boolean
`added_by`	Device	device name (cf. issue #23)
`added_on`	Timestamp	device timestamp
`stored_on`	Timestamp	server timestamp
`marked_read_by`	Device	device name (cf. issue #23)
`marked_read_on`	Timestamp	device timestamp
`read_position`	Integer	Words read from the beginning

API

GET /

The returned value is a JSON mapping containing:

hello: the name of the service ("reading list")
version: complete version ("X.Y.Z")
url: absolute URI (without a trailing slash) of the API (can be used by client to build URIs)
eos: date of end of support in ISO 8601 format ("yyyy-mm-dd", null if unknown)

GET /heartbeat

Return the status of each service the reading list depends on. The returned value is a JSON mapping containing:

database: true if operational

Return 200 if the connection with each service is working properly and 503 if something doesn't work.
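
The status logic described above can be sketched as follows. This is only an illustration: the `heartbeat` function and the shape of `checks` (a mapping of service name to a zero-argument check function) are assumptions, not part of the proposal.

```python
def heartbeat(checks: dict) -> tuple:
    """Return (HTTP status, JSON body) from a mapping of service name to check function."""
    body = {}
    for name, check in checks.items():
        try:
            body[name] = bool(check())
        except Exception:
            # Assumption: a check that raises counts as a failing service.
            body[name] = False
    # 200 only if every dependency is operational, 503 otherwise.
    status = 200 if all(body.values()) else 503
    return status, body
```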

GET /articles

Requires an FxA OAuth authentication

Returns all articles of the current user.

The returned value is a JSON mapping containing:

items: the list of articles, with exhaustive attributes

See the Data model section for all article attributes.

A Total-Records header is sent back to indicate the total number of records included in the response.

A Last-Modified header will provide the current timestamp of the collection (see Server timestamps section). Clients are likely to use it to provide If-Modified-Since or If-Unmodified-Since headers in subsequent requests.

Filtering

Single value

/articles?unread=true

Multiple values

/articles?status=1,2

Minimum and maximum

Prefix attribute name with min_ or max_:

/articles?min_word_count=4000

note:	The lower and upper bounds are inclusive (i.e. `min_` means greater than or equal, `max_` means less than or equal).

Exclude

Prefix attribute name with not_:

/articles?not_status=0

note:	Will return an error if an attribute is unknown.
note:	The `Last-Modified` response header will always be the same as the unfiltered collection.
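
A client-side helper for building these filter URLs might look like the sketch below. The function name `articles_query` and its keyword arguments are hypothetical; only the `min_`, `max_` and `not_` prefixes and the comma-separated multiple values come from the rules above.

```python
from urllib.parse import urlencode


def articles_query(equals=None, minimum=None, maximum=None, exclude=None) -> str:
    """Build an /articles URL from filter mappings of attribute name to value."""
    params = {}
    groups = (("", equals), ("min_", minimum), ("max_", maximum), ("not_", exclude))
    for prefix, group in groups:
        for name, value in (group or {}).items():
            if isinstance(value, (list, tuple)):
                # Multiple values are sent as a comma-separated list.
                value = ",".join(str(v) for v in value)
            elif isinstance(value, bool):
                value = "true" if value else "false"
            params[prefix + name] = value
    if not params:
        return "/articles"
    # Keep commas unescaped so the URL matches the documented form.
    return "/articles?" + urlencode(params, safe=",")
```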

Sorting

/articles?_sort=-last_modified,title

note:	Articles will be ordered by `-stored_on` by default (i.e. newest first).
note:	Ordering on a boolean field gives `true` values first.
note:	Will return an error if an attribute is unknown.
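
To illustrate how a `_sort` value reads, here is a small sketch that parses it and applies it to an in-memory list. It is not the server implementation: `parse_sort`, `sort_articles` and the `known` set are hypothetical names, and the sketch assumes every sorted field holds comparable, non-null values.

```python
def parse_sort(spec: str, known: set) -> list:
    """Turn '-last_modified,title' into [('last_modified', True), ('title', False)]."""
    fields = []
    for part in spec.split(","):
        descending = part.startswith("-")
        name = part[1:] if descending else part
        if name not in known:
            # Mirrors the documented behaviour: unknown attributes are an error.
            raise ValueError(f"unknown attribute: {name}")
        fields.append((name, descending))
    return fields


def sort_articles(articles: list, spec: str, known: set) -> list:
    """Sort a copy of `articles` according to a `_sort` specification."""
    ordered = list(articles)
    # Sort by the least significant key first; Python's sort is stable,
    # so earlier keys in the spec end up taking precedence.
    for name, descending in reversed(parse_sort(spec, known)):
        ordered.sort(key=lambda article: article[name], reverse=descending)
    return ordered
```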

Counting

In order to count the number of records, by status for example, without fetching the actual collection, a HEAD request can be used. The Total-Records response header will then provide the total number of records.

Polling for changes

The _since parameter is provided as an alias for min_last_modified (greater or equal).

/articles?_since=123456

The collection's latest modification timestamp is provided in the response headers (see Server timestamps section).

When the _since parameter is provided, all deleted articles will appear in the list with a deleted status (status=2).

If the request header If-Modified-Since is provided, and if the collection has not changed in the meantime, a 304 Not Modified response is returned.
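
On the client side, merging the result of a `_since` request into a local copy could be sketched as follows. The `apply_changes` helper and the choice of a dict keyed by `id` as local store are assumptions made for illustration.

```python
DELETED = 2  # value of `status` for deleted articles


def apply_changes(local: dict, items: list) -> dict:
    """Merge the items of a `_since` response into a local store keyed by article id."""
    for item in items:
        if item["status"] == DELETED:
            # Deleted articles are listed with status=2: drop the local copy.
            local.pop(item["id"], None)
        else:
            local[item["id"]] = item
    return local
```

Since `_since` is inclusive, records carrying exactly the previous timestamp may be received again; applying them a second time with this sketch leaves the store unchanged.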

Pagination

Paging is performed through a _limit parameter and a Next-Page response header.

Clients should begin by issuing a GET /articles?_limit=<LIMIT> request, which will return up to <LIMIT> items.

/articles?_limit=100

If there are additional items matching the query, the response will be a 206 Partial Content and include a Next-Page header containing the full URL of the next page.

To fetch additional items, the next request is performed on the URL obtained from Next-Page header. This process is repeated until the response does not include the Next-Page header.

note:	Using the `Next-Page` technique (i.e. continuation tokens) the implementation of pagination is completely hidden from clients, and thus completely interchangeable.

Pagination on a filtered collection should not be obstructed by modification or creation of non matching records.

To guard against other clients making concurrent changes to the collection, the next page URL will contain information about the collection obtained on the first pagination call.

note:	Will return an error if `_limit` has an invalid value (e.g. non integer or above maximum).
note:	Will return a `412 Precondition failed` error if a modification has occurred since the first pagination call. Pagination should be restarted from the first page, i.e. without pagination parameters.
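
The pagination loop described in this section can be sketched as below. To stay independent of any HTTP library, the sketch takes a `fetch` callable returning `(status, headers, body)`; that callable, the `CollectionModified` exception and `iter_articles` are all hypothetical names.

```python
from typing import Callable, Iterator, Tuple

Response = Tuple[int, dict, dict]  # (HTTP status, response headers, JSON body)


class CollectionModified(Exception):
    """Raised on 412: pagination must be restarted from the first page."""


def iter_articles(fetch: Callable[[str], Response], limit: int = 100) -> Iterator[dict]:
    """Yield every article by following Next-Page headers until none is returned."""
    url = f"/articles?_limit={limit}"
    while url:
        status, headers, body = fetch(url)
        if status == 412:
            raise CollectionModified(url)
        yield from body["items"]
        # The continuation URL is opaque: it is reused as provided by the server.
        url = headers.get("Next-Page")
```

If `CollectionModified` is raised midway, the items already yielded come from an outdated view of the collection, so a caller would typically discard them and start over.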

List of available URL parameters

<prefix?><attribute name>: filter by value(s)
_since: polling changes
_sort: order list
_limit: pagination max size

Some additional internal parameters are used by pagination. Clients do not need to be aware of them, since they are set and provided through the Next-Page header.

_page_token: pagination continuation token

Combining all parameters

Filtering, sorting and paginating can all be combined together.

/articles?_sort=-last_modified&_limit=100

POST /articles

Requires an FxA OAuth authentication

Used to create an article on the server. The POST body is a JSON mapping containing:

url
title
added_by

note:	Since the device which added the article can differ from the current device (e.g. while importing), the device name is not provided through a request header.

The POST response body is the newly created record, if all posted values are valid. A success response is 201 Created. Additional optional attributes can also be specified; they are listed under Optional values.

If the request header If-Unmodified-Since is provided, and if the record has changed meanwhile, a 412 Precondition failed error is returned.

Optional values

added_on
excerpt
favorite
unread
status
is_article
resolved_url
resolved_title

Auto default values

For v1, the server will assign default values to the following attributes:

id: uuid
resolved_url: url
resolved_title: title
excerpt: empty text
status: 0-OK
favorite: false
unread: true
is_article: true
last_modified: current server timestamp
stored_on: current server timestamp
marked_read_by: null
marked_read_on: null
word_count: null
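
The v1 defaults listed above could be applied as in the following sketch. The function `with_defaults` and its `now_ms` parameter are hypothetical; the sketch uses the plain clock and does not provide the uniqueness guarantee described in the Server timestamps section.

```python
import time
import uuid


def with_defaults(posted: dict, now_ms=None) -> dict:
    """Return a full v1 record from posted values (`url` and `title` are required)."""
    now = int(time.time() * 1000) if now_ms is None else now_ms
    record = {
        "resolved_url": posted["url"],
        "resolved_title": posted["title"],
        "excerpt": "",
        "status": 0,
        "favorite": False,
        "unread": True,
        "is_article": True,
        "marked_read_by": None,
        "marked_read_on": None,
        "word_count": None,
    }
    # Posted values override the defaults above.
    record.update(posted)
    # These attributes are always assigned by the server.
    record["id"] = str(uuid.uuid4())
    record["last_modified"] = now
    record["stored_on"] = now
    return record
```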

For v2, the server will fetch the content, and assign the following attributes with actual values:

resolved_url: the final URL obtained after all redirections resolved
resolved_title: The fetched page's title (content of <title>)
excerpt: The first 200 words of the article
word_count: Total word count of the article

Validation

If the posted values are invalid (e.g. added_on is not an integer) an error response is returned with status 400. See details on error responses.

note:	The `status` can take only `0` (OK) and `1` (archived), even though the server sets it to `2` when including deleted articles in the collection.
note:	(undecided) In some cases, it can make sense for the server to arbitrarily fix validation errors on records (e.g. truncating long titles).

Conflicts

Article URLs are unique per user (both url and resolved_url).

note:	A `url` always resolves towards the same URL. If `url` is not unique, then its `resolved_url` won't be either.
note:	Unicity on URLs is determined by the full URL, including location hash. (e.g. http://news.com/day-1.html#paragraph1, http://spa.com/#/content/3)
note:	Deleted items are not taken into account for URL unicity. Delete-then-add will succeed.

If an article is created with a URL that already exists, a 303 See Other response is returned to indicate the existing record.

The response body is a JSON mapping, with the following attribute:

id: the id of the conflicting record

GET /articles/<id>

Requires an FxA OAuth authentication

Returns a specific article by its id.

For convenience and consistency, a header Last-Modified will also repeat the value of last_modified.

If the request header If-Modified-Since is provided, and if the record has not changed meanwhile, a 304 Not Modified is returned.

note:	Even though article URLs are unique together, we use the article id field to target individual records.

DELETE /articles/<id>

Requires an FxA OAuth authentication

Delete a specific article by its id.

The DELETE response is the record that was deleted.

If the record is missing (or already deleted), a 404 Not Found is returned. The client might decide to ignore it.

If the request header If-Unmodified-Since is provided, and if the record has changed meanwhile, a 412 Precondition failed error is returned.

note:	Once deleted, an article will appear in the collection with a deleted status (`status=2`) and will have most of its fields empty.
note:	The server will have to implement an internal mechanism to keep track of deleted items, and purge them eventually.

PATCH /articles/<id>

Requires an FxA OAuth authentication

Modify a specific article by its id. The PATCH body is a JSON mapping containing a subset of article fields.

The PATCH response is the modified record (full).

Modifiable fields

title
excerpt
favorite
unread
status
is_article
resolved_url
resolved_title
read_position

If the record is missing (or already deleted), a 404 Not Found error is returned. The client might decide to ignore it.

If the request header If-Unmodified-Since is provided, and if the record has changed meanwhile, a 412 Precondition failed error is returned.

note:	`last_modified` is updated to the current server timestamp.
note:	Changing `read_position` never generates conflicts.
note:	`read_position` can only be changed to a value greater than the current one.
note:	If `unread` is changed to false, `marked_read_on` and `marked_read_by` are expected to be provided.
note:	If `unread` was already false, `marked_read_on` and `marked_read_by` are not updated with provided values.
note:	If `unread` is changed to true, `marked_read_by` and `marked_read_on` are changed automatically to null.
note:	As mentioned in the Validation section, an article status cannot take the value `2`.
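
The notes above can be read as the following sketch. Several behaviours are assumptions, since the proposal does not settle them: a `read_position` that is not greater is silently ignored here, missing `marked_read_*` values raise a `ValueError`, and no check of the modifiable fields is performed. The function name `apply_patch` is hypothetical.

```python
def apply_patch(record: dict, patch: dict, now_ms: int) -> dict:
    """Return a patched copy of `record`, following the notes on PATCH."""
    updated = dict(record)
    for name, value in patch.items():
        if name in ("marked_read_on", "marked_read_by"):
            continue  # only taken into account together with `unread`, below
        if name == "read_position" and value <= (record.get("read_position") or 0):
            continue  # assumption: a position that is not greater is ignored
        updated[name] = value
    if patch.get("unread") is False and record["unread"]:
        # Changed from unread to read: the device must say when and by whom.
        if "marked_read_on" not in patch or "marked_read_by" not in patch:
            raise ValueError("marked_read_on and marked_read_by are required")
        updated["marked_read_on"] = patch["marked_read_on"]
        updated["marked_read_by"] = patch["marked_read_by"]
    elif patch.get("unread") is True:
        updated["marked_read_on"] = None
        updated["marked_read_by"] = None
    updated["last_modified"] = now_ms
    return updated
```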

Conflicts

(Draft)

If the modification of resolved_url introduces a conflict, because another record violates unicity, a 409 Conflict error response is returned.

The error attributes will be set:

info: the URL of the conflicting record

Batch operations

Requires an FxA OAuth authentication

POST /batch

The POST body is a mapping, with the following attributes:

requests: the list of requests
defaults: (optional) values in common for all requests

Each request is a JSON mapping, with the following attributes:

method: HTTP verb
path: URI
body: a mapping
headers: (optional), otherwise take those of batch request

{
  "defaults": {
    "method" : "POST",
    "path" : "/articles",
    "headers" : {
      ...
    }
  },
  "requests": [
    {
      "body" : {
        "title": "MoFo",
        "url" : "http://mozilla.org"
      }
    },
    {
      "body" : {
        "title": "MoCo",
        "url" : "http://mozilla.com"
      }
    },
    {
      "method" : "PATCH",
      "path" : "/articles/409",
      "body" : {
        "read_position" : 3477
      }
    }
  ]
}

The response body is a list of all responses:

{
  "defaults": {
    "path" : "/articles"
  },
  "responses": [
    {
      "path" : "/articles/409",
      "status": 200,
      "body" : {
        "id": 409,
        "url": "...",
        ...
        "read_position" : 3477
      },
      "headers": {
        ...
      }
    },
    {
      "status": 201,
      "body" : {
        "id": 411,
        "title": "MoFo",
        "url" : "http://mozilla.org",
        ...
      }
    },
    {
      "status": 201,
      "body" : {
        "id": 412,
        "title": "MoCo",
        "url" : "http://mozilla.com",
        ...
      }
    }
  ]
}

note:	The responses are not necessarily in the same order as the requests.
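
How `defaults` combine with each entry of `requests` can be sketched as a shallow merge, where values set on a request win over the defaults. This is one possible reading of the format; the helper name `expand_requests` is hypothetical.

```python
def expand_requests(batch: dict) -> list:
    """Return the list of full requests, with `defaults` filled in where missing."""
    defaults = batch.get("defaults", {})
    # Shallow merge: any attribute present on the request overrides the default.
    return [{**defaults, **request} for request in batch["requests"]]
```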

Pros & Cons

This respects REST principles
This is easy for the client to handle, since it just has to pile up HTTP requests while offline
It looks to be a convention for several REST APIs (Neo4J, Facebook, Parse)
Payload of response can be heavy, especially while importing huge collections

Massive operations

(Undecided, Draft)

In order to limit the size of response payloads, a request header Light-Response can be added. Only status and body attributes will be returned, and only fields specified in the header will be included.

For example, with Light-Response: id, stored_on, errno, info:

{
  "responses": [
    {
      "status": 200,
      "body" : {
        "id": "409",
        "stored_on": "1234567"
      }
    },
    {
      "status": 201,
      "body" : {
        "id": 412,
        "stored_on": "988767568"
      }
    },
    {
      "status": 409,
      "body" : {
        "errno": 122,
        "info": "http://server/v1/articles/970"
      }
    },
    {
      "status": 303,
      "body" : {
        "id": "667"
      }
    }
  ]
}
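
The trimming performed for Light-Response could be sketched as follows. Since this part of the proposal is still undecided, the sketch is only one possible reading; `light_responses` is a hypothetical name.

```python
def light_responses(responses: list, header: str) -> list:
    """Keep only `status` and the body fields named in the Light-Response header."""
    fields = [field.strip() for field in header.split(",")]
    return [
        {
            "status": response["status"],
            # Fields absent from a given body are simply skipped.
            "body": {name: response["body"][name] for name in fields if name in response["body"]},
        }
        for response in responses
    ]
```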

Server timestamps

In order to avoid race conditions, the timestamps manipulated by the server are neither true HTTP date values nor plain milliseconds EPOCH timestamps.

They are milliseconds EPOCH timestamps with the guarantee of a change per timestamp update. If two changes happen in the same millisecond, they will still have two different timestamps.

The Last-Modified header, carrying the latest timestamp of the collection for a given user, will be sent on the collection and record GET endpoints.

Last-Modified: 1422375916186

note:	Both fields `added_on` and `marked_read_on` will contain actual timestamps (from device perspective), used for calendar year information display.

All timestamps of the app will be expressed in milliseconds.
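
One way to obtain the "one change per update" guarantee is to never hand out a value lower than or equal to the previous one. The sketch below shows this for a single process; the class name `CollectionClock` is hypothetical, and a real deployment with several server processes would need the same guarantee at the storage level.

```python
import threading
import time


class CollectionClock:
    """Hands out millisecond timestamps that strictly increase on every call."""

    def __init__(self) -> None:
        self._last = 0
        self._lock = threading.Lock()

    def bump(self, now_ms=None) -> int:
        """Return the current time in ms, or last + 1 if the clock has not advanced."""
        current = int(time.time() * 1000) if now_ms is None else now_ms
        with self._lock:
            self._last = max(current, self._last + 1)
            return self._last
```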

Backoff indicators

A Backoff header will be added to the success responses (>=200 and <400) when the server is under heavy load. It provides the client with a number of seconds during which it should avoid doing unnecessary requests.

Backoff: 30

note:	The back-off time is configurable on the server.
note:	This feature could be handled by videur
note:	In other implementations at Mozilla, there was `X-Weave-Backoff` and `X-Backoff`, but the `X-` prefix for headers has since been deprecated.

Retry-After indicators

A Retry-After header will be added to error responses (>=500), telling the client how many seconds it should wait before trying again.

Retry-After: 30
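
A client could honour both indicators with a helper like the one sketched below. The name `seconds_to_wait` is hypothetical, and the sketch assumes headers are available as a plain dict with the exact header names used in this document.

```python
def seconds_to_wait(status: int, headers: dict) -> int:
    """Return how long the client should hold off, in seconds (0 if no indication)."""
    if status >= 500 and "Retry-After" in headers:
        # Error response: wait before trying the same request again.
        return int(headers["Retry-After"])
    if 200 <= status < 400 and "Backoff" in headers:
        # Success response under heavy load: avoid unnecessary requests for a while.
        return int(headers["Backoff"])
    return 0
```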

Error responses

Every response is JSON.

If the HTTP status is not a success (<200 or >=400), the response contains a JSON mapping, with the following attributes:

code: matches the HTTP status code (e.g. 400)
errno: stable application-level error number (e.g. 109)
error: string description of error type (e.g. "Bad request")
message: context information (e.g. "Invalid request parameters")
info: additional details (e.g. URL to error details)
validation: optional information on invalid posted data

Example response

{
    "code": 400,
    "errno": 109,
    "error": "Bad Request",
    "message": "Invalid posted data",
    "info": "https://server/docs/api.html#errors",
    "validation": [{
        "name": "title",
        "description": "Required",
        "location": "body"
    },
    {
        "name": "url",
        "description": "Invalid URL format",
        "location": "body"
    }]
}

status code	errno	description
401	104	Missing Authorization Token
401	105	Invalid Authorization Token
400	106	request body was not valid JSON
400	107	invalid request parameter
400	108	missing request parameter
400	109	invalid posted data
404	110	Invalid Token / id
404	111	Missing Token / id
403	121	Resource's access forbidden for this user
409	?	Another resource violates constraint
411	112	Content-Length header was not provided
412	114	Resource was modified meanwhile
413	113	Request body too large
429	117	Client has sent too many requests
500	999	Internal Server Error
503	201	Service temporarily unavailable due to high load
5??	202	Client version too old
513	?	Service Decommissioned

API versioning

The API versioning is based on the application version deployed. It follows semver.

During development the server will be 0.X.X, and the server endpoint will be prefixed by /v0.

Each API change that is not backward compatible will require the major version number to be incremented. Every effort will be made to avoid backward-incompatible changes.

The / endpoint will redirect to the last API version.

Deprecation

Client versions will be tracked in order to know after which date each old version can be shut down. The date of the end of support is provided in the API root URL (e.g. /v0).

Using the Alert header, the server can communicate any potential warning messages, information, or other alerts. The value is a JSON mapping with the following attributes:

code: one of the strings "deprecated-client", "soft-eol" or "hard-eol"
message: a human-readable message
url: a URL at which more information is available

A 503 Service Unavailable error response can be returned if the client version is too old.

A 513 Service Decommissioned error response can be returned indicating that the service has been replaced with a new and better service using some as-yet-undesigned protocol.
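
Reading the Alert header on the client side could be sketched as follows. The helper `parse_alert` is hypothetical, and raising on an unknown code is an assumption; a client could just as well ignore it.

```python
import json

ALERT_CODES = {"deprecated-client", "soft-eol", "hard-eol"}


def parse_alert(headers: dict):
    """Return the decoded Alert mapping, or None if the header is absent."""
    raw = headers.get("Alert")
    if raw is None:
        return None
    alert = json.loads(raw)
    if alert.get("code") not in ALERT_CODES:
        # Assumption: codes outside the documented list are rejected.
        raise ValueError(f"unexpected alert code: {alert.get('code')!r}")
    return alert
```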
