diff --git a/aip/general/0233.md b/aip/general/0233.md index 23f83b2464..1b9144eaca 100644 --- a/aip/general/0233.md +++ b/aip/general/0233.md @@ -15,7 +15,9 @@ transaction. A batch create method provides this functionality. ## Guidance -APIs **may** support Batch Create using the following pattern: +APIs **may** support Batch Create using the following two patterns: + +Returning the response synchronously ```proto rpc BatchCreateBooks(BatchCreateBooksRequest) returns (BatchCreateBooksResponse) { @@ -26,25 +28,51 @@ rpc BatchCreateBooks(BatchCreateBooksRequest) returns (BatchCreateBooksResponse) } ``` +Returning an Operation which resolves to the response asynchronously + +```proto +rpc BatchCreateBooks(BatchCreateBooksRequest) returns (google.longrunning.Operation) { + option (google.api.http) = { + post: "/v1/{parent=publishers/*}/books:batchCreate" + body: "*" + }; + option (google.longrunning.operation_info) = { + response_type: "BatchCreateBooksResponse" + metadata_type: "BatchCreateBooksOperationMetadata" + }; +} +``` + - The RPC's name **must** begin with `BatchCreate`. The remainder of the RPC name **should** be the plural form of the resource being created. - The request and response messages **must** match the RPC name, with `Request` and `Response` suffixes. - - However, in the event that the request may take a significant amount of - time, the response message **must** be a `google.longrunning.Operation` - which ultimately resolves to the `Response` type. +- If the batch method returns a `google.longrunning.Operation`, both the + `response_type` and `metadata_type` fields **must** be specified. - The HTTP verb **must** be `POST`. - The HTTP URI **must** end with `:batchCreate`. - The URI path **should** represent the collection for the resource, matching the collection used for simple CRUD operations. If the operation spans parents, a dash (`-`) **may** be accepted as a wildcard. 
- The body clause in the `google.api.http` annotation **should** be `"*"`. -- The operation **should** be atomic: it **should** fail for all resources or - succeed for all resources (no partial success). - - If the operation covers multiple locations and at least one location is - down, the operation **must** fail. - - In cases where supporting partial responses cannot be avoided, the design - should follow the guidelines of [AIP-193](https://aip.dev/193). + +### Atomic vs. Partial Success + +- The batch create method **may** support atomic (all resources created or none + are) or partial success behavior. To make a choice, consider the following + factors: + - **Complexity of Ensuring Atomicity:** Operations that are simple + passthrough database transactions **should** use an atomic operation, + while operations that manage complex resources **should** use partial + success operations. + - **End-User Experience:** Consider the perspective of the API consumer. + Would atomic behavior be preferable for the given use case, even if it + means that a large batch could fail due to issues with a single or a few + entries? +- Synchronous batch create **must** be atomic. +- Asynchronous batch create **may** support atomic or partial success. + - If supporting partial success, see + [Operation metadata message](#operation-metadata-message) requirements. ### Request message @@ -111,11 +139,95 @@ message BatchCreateBooksResponse { - The response message **must** include one repeated field corresponding to the resources that were created. +### Operation metadata message + +- The `metadata_type` message **must** either match the RPC name with an + `OperationMetadata` suffix, or be named with a `Batch` prefix and an + `OperationMetadata` suffix if the type is shared by multiple Batch methods. +- If the batch create method supports partial success, the metadata message **must** + include a `map<int32, google.rpc.Status> failed_requests` field to communicate + the partial failures. 
+ - The key in this map is the index of the request in the `requests` field in + the batch request. + - The value in each map entry **must** mirror the error(s) that would normally + be returned by the singular Standard Create method. + - If a failed request can eventually succeed due to server-side retries, such + transient errors **must not** be communicated using `failed_requests`. + - When all requests in the batch fail, `Operation.error` **must** be set with + `code = google.rpc.Code.ABORTED` and `message = "None of the requests + succeeded, refer to the BatchCreateBooksOperationMetadata.failed_requests + for individual error details"`. +- The metadata message **may** include other fields to communicate the + operation progress. + +### Adopting Partial Success + +In order for an existing Batch API to adopt the partial success pattern, the API +must do the following: + +- The default behavior must be retained to avoid incompatible behavioral + changes. +- If the API returns an Operation: + - The request message **must** have a `bool return_partial_success` field. + - The Operation `metadata_type` **must** include a + `map<int32, google.rpc.Status> failed_requests` field. + - When the `bool return_partial_success` field is set to true in a request, + the API should allow partial success behavior; otherwise it should continue + with atomic behavior as default. +- If the API returns a direct response synchronously: + - Since the existing clients will treat a success response as an atomic + operation, the existing version of the API **must not** adopt the partial + success pattern. + - A new version **must** be created instead that returns an Operation and + follows the partial success pattern described in this AIP. + +## Rationale + +### Restricting synchronous batch methods to be atomic + +The restriction that synchronous batch methods must be atomic is a result of +the following considerations. + +The previous iteration of this AIP recommended that batch methods must be atomic. 
+There is no clear way to convey partial failure in a synchronous response status code +because an OK status implies that every request succeeded. Therefore, adding a new field to the +response to indicate partial failure would be a breaking change because the +existing clients would interpret an OK response as all resources created. + +On the other hand, as described in [AIP-193](https://aip.dev/193), Operations +are more capable of presenting partial states. The response status code for an +Operation does not convey anything about the outcome of the underlying operation +and a client has to check the response body to determine if the operation was +successful. + +### Communicating partial failures + +The AIP recommends using a `map<int32, google.rpc.Status> failed_requests` field +to communicate partial failures, where the key is the index of the failed +request in the original batch request. The other options considered were: + +- A `repeated google.rpc.Status` field. This was rejected because it is not + clear which entry corresponds to which request. +- A `map<string, google.rpc.Status>` field, where the key is the request id of + the failed request. This was rejected because: + - Clients would need to maintain a map of request id to request in order to use + the partial success response. + - Populating a request id for the purpose of communicating errors could + conflict with [AIP-155](https://aip.dev/155) if the service cannot + guarantee idempotency for an individual request across multiple batch + requests. +- A `repeated FailedRequest` field, where `FailedRequest` contains the individual + create request and the `google.rpc.Status`. This was rejected because echoing + the request payload back in response is discouraged due to additional + challenges around user data sensitivity. 
+ [aip-122-parent]: ./0122.md#fields-representing-a-resources-parent [request-message]: ./0133.md#request-message ## Changelog +- **2025-03-06**: Added detailed guidance for partial success behavior, and + decision framework for choosing between atomic and partial success - **2023-04-18**: Changed the recommendation to allow returning partial successes. - **2022-06-02**: Changed suffix descriptions to eliminate superfluous "-". diff --git a/aip/general/0234.md b/aip/general/0234.md index 72e3e80eb9..7a4410844b 100644 --- a/aip/general/0234.md +++ b/aip/general/0234.md @@ -15,7 +15,9 @@ transaction. A batch update method provides this functionality. ## Guidance -APIs **may** support Batch Update using the following pattern: +APIs **may** support Batch Update using the following two patterns: + +Returning the response synchronously ```proto rpc BatchUpdateBooks(BatchUpdateBooksRequest) returns (BatchUpdateBooksResponse) { @@ -26,23 +28,51 @@ rpc BatchUpdateBooks(BatchUpdateBooksRequest) returns (BatchUpdateBooksResponse) } ``` +Returning an Operation which resolves to the response asynchronously + +```proto +rpc BatchUpdateBooks(BatchUpdateBooksRequest) returns (google.longrunning.Operation) { + option (google.api.http) = { + post: "/v1/{parent=publishers/*}/books:batchUpdate" + body: "*" + }; + option (google.longrunning.operation_info) = { + response_type: "BatchUpdateBooksResponse" + metadata_type: "BatchUpdateBooksOperationMetadata" + }; +} +``` + - The RPC's name **must** begin with `BatchUpdate`. The remainder of the RPC name **should** be the plural form of the resource being updated. - The request and response messages **must** match the RPC name, with `Request` and `Response` suffixes. - - However, in the event that the request may take a significant amount of - time, the response message **must** be a `google.longrunning.Operation` - which ultimately resolves to the `Response` type. 
+- If the batch method returns a `google.longrunning.Operation`, both the + `response_type` and `metadata_type` fields **must** be specified. - The HTTP verb **must** be `POST`. - The HTTP URI **must** end with `:batchUpdate`. - The URI path **should** represent the collection for the resource, matching the collection used for simple CRUD operations. If the operation spans parents, a dash (`-`) **may** be accepted as a wildcard. - The body clause in the `google.api.http` annotation **should** be `"*"`. -- The operation **must** be atomic: it **must** fail for all resources or - succeed for all resources (no partial success). - - If the operation covers multiple locations and at least one location is - down, the operation **must** fail. + +### Atomic vs. Partial Success + +- The batch update method **may** support atomic (all resources updated or none + are) or partial success behavior. To make a choice, consider the following + factors: + - **Complexity of Ensuring Atomicity:** Operations that are simple + passthrough database transactions **should** use an atomic operation, + while operations that manage complex resources **should** use partial + success operations. + - **End-User Experience:** Consider the perspective of the API consumer. + Would atomic behavior be preferable for the given use case, even if it + means that a large batch could fail due to issues with a single or a few + entries? +- Synchronous batch update **must** be atomic. +- Asynchronous batch update **may** support atomic or partial success. + - If supporting partial success, see + [Operation metadata message](#operation-metadata-message) requirements. ### Request message @@ -107,11 +137,95 @@ message BatchUpdateBooksResponse { - The response message **must** include one repeated field corresponding to the resources that were updated. 
+### Operation metadata message + +- The `metadata_type` message **must** either match the RPC name with an + `OperationMetadata` suffix, or be named with a `Batch` prefix and an + `OperationMetadata` suffix if the type is shared by multiple Batch methods. +- If the batch update method supports partial success, the metadata message **must** + include a `map<int32, google.rpc.Status> failed_requests` field to communicate + the partial failures. + - The key in this map is the index of the request in the `requests` field + in the batch request. + - The value in each map entry **must** mirror the error(s) that would normally + be returned by the singular Standard Update method. + - If a failed request can eventually succeed due to server-side retries, such + transient errors **must not** be communicated using `failed_requests`. + - When all requests in the batch fail, `Operation.error` **must** be set with + `code = google.rpc.Code.ABORTED` and `message = "None of the requests + succeeded, refer to the BatchUpdateBooksOperationMetadata.failed_requests + for individual error details"`. +- The metadata message **may** include other fields to communicate the + operation progress. + +### Adopting Partial Success + +In order for an existing Batch API to adopt the partial success pattern, the API +must do the following: + +- The default behavior must be retained to avoid incompatible behavioral + changes. +- If the API returns an Operation: + - The request message **must** have a `bool return_partial_success` field. + - The Operation `metadata_type` **must** include a + `map<int32, google.rpc.Status> failed_requests` field. + - When the `bool return_partial_success` field is set to true in a request, + the API should allow partial success behavior; otherwise it should continue + with atomic behavior as default. +- If the API returns a direct response synchronously: + - Since the existing clients will treat a success response as an atomic + operation, the existing version of the API **must not** adopt the partial + success pattern. 
+ - A new version **must** be created instead that returns an Operation and + follows the partial success pattern described in this AIP. + +## Rationale + +### Restricting synchronous batch methods to be atomic + +The restriction that synchronous batch methods must be atomic is a result of +the following considerations. + +The previous iteration of this AIP recommended that batch methods must be atomic. +There is no clear way to convey partial failure in a synchronous response status code +because an OK status implies that every request succeeded. Therefore, adding a new field to the +response to indicate partial failure would be a breaking change because the +existing clients would interpret an OK response as all resources updated. + +On the other hand, as described in [AIP-193](https://aip.dev/193), Operations +are more capable of presenting partial states. The response status code for an +Operation does not convey anything about the outcome of the underlying operation +and a client has to check the response body to determine if the operation was +successful. + +### Communicating partial failures + +The AIP recommends using a `map<int32, google.rpc.Status> failed_requests` field +to communicate partial failures, where the key is the index of the failed +request in the original batch request. The other options considered were: + +- A `repeated google.rpc.Status` field. This was rejected because it is not + clear which entry corresponds to which request. +- A `map<string, google.rpc.Status>` field, where the key is the request id of + the failed request. This was rejected because: + - Clients would need to maintain a map of request id to request in order to use + the partial success response. + - Populating a request id for the purpose of communicating errors could + conflict with [AIP-155](https://aip.dev/155) if the service cannot + guarantee idempotency for an individual request across multiple batch + requests. +- A `repeated FailedRequest` field, where `FailedRequest` contains the individual + update request and the `google.rpc.Status`. 
This was rejected because echoing + the request payload back in response is discouraged due to additional + challenges around user data sensitivity. + [aip-122-parent]: ./0122.md#fields-representing-a-resources-parent [request-message]: ./0134.md#request-message ## Changelog +- **2025-03-06**: Changed recommendation to allow partial success, along with + detailed guidance - **2022-06-02:** Changed suffix descriptions to eliminate superfluous "-". - **2020-09-16**: Suggested annotating `parent` and `requests` fields. - **2020-08-27**: Removed parent recommendations for top-level resources. diff --git a/aip/general/0235.md b/aip/general/0235.md index 16e526d445..e38609443b 100644 --- a/aip/general/0235.md +++ b/aip/general/0235.md @@ -15,7 +15,9 @@ transaction. A batch delete method provides this functionality. ## Guidance -Batch delete methods are specified using the following pattern: +APIs **may** support Batch Delete using the following two patterns: + +Returning the response synchronously ```proto rpc BatchDeleteBooks(BatchDeleteBooksRequest) returns (google.protobuf.Empty) { @@ -26,25 +28,55 @@ rpc BatchDeleteBooks(BatchDeleteBooksRequest) returns (google.protobuf.Empty) { } ``` +Returning an Operation which resolves to the response asynchronously + +```proto +rpc BatchDeleteBooks(BatchDeleteBooksRequest) returns (google.longrunning.Operation) { + option (google.api.http) = { + post: "/v1/{parent=publishers/*}/books:batchDelete" + body: "*" + }; + option (google.longrunning.operation_info) = { + response_type: "google.protobuf.Empty" + metadata_type: "BatchDeleteBooksOperationMetadata" + }; +} +``` + - The RPC's name **must** begin with `BatchDelete`. The remainder of the RPC name **should** be the plural form of the resource being deleted. - The request message **must** match the RPC name, with a `Request` suffix. - The response message **should** be `google.protobuf.Empty`. 
- If the resource is [soft deleted][soft-delete], the response message **should** be a response message containing the updated resources. - - In the event that the request may take a significant amount of time, the - response message **must** be a `google.longrunning.Operation` which - resolves to the correct response. +- If the batch method returns a `google.longrunning.Operation`, both the + `response_type` and `metadata_type` fields **must** be specified. + - If the resource is [soft deleted][soft-delete], the `response_type` + **should** be a response message containing the updated resources. - The HTTP verb **must** be `POST` (not `DELETE`). - The HTTP URI **must** end with `:batchDelete`. - The URI path **should** represent the collection for the resource, matching the collection used for simple CRUD operations. If the operation spans parents, a dash (`-`) **may** be accepted as a wildcard. - The body clause in the `google.api.http` annotation **should** be `"*"`. -- The operation **should** be atomic: it **should** fail for all resources or - succeed for all resources (no partial success). - - If the operation covers multiple locations and at least one location is - down, the operation **must** fail. + +### Atomic vs. Partial Success + +- The batch delete method **may** support atomic (all resources deleted or none + are) or partial success behavior. To make a choice, consider the following + factors: + - **Complexity of Ensuring Atomicity:** Operations that are simple + passthrough database transactions **should** use an atomic operation, + while operations that manage complex resources **should** use partial + success operations. + - **End-User Experience:** Consider the perspective of the API consumer. + Would atomic behavior be preferable for the given use case, even if it + means that a large batch could fail due to issues with a single or a few + entries? +- Synchronous batch delete **must** be atomic. 
+- Asynchronous batch delete **may** support atomic or partial success. + - If supporting partial success, see + [Operation metadata message](#operation-metadata-message) requirements. ### Request message @@ -170,6 +202,88 @@ message BatchDeleteBooksResponse { - The response message **must** include one repeated field corresponding to the resources that were soft-deleted. +### Operation metadata message + +- The `metadata_type` message **must** either match the RPC name with an + `OperationMetadata` suffix, or be named with a `Batch` prefix and an + `OperationMetadata` suffix if the type is shared by multiple Batch methods. +- If the batch delete method supports partial success, the metadata message **must** + include a `map<int32, google.rpc.Status> failed_requests` field to communicate + the partial failures. + - The key in this map is the index of the request in the `requests` field in + the batch request. + - The value in each map entry **must** mirror the error(s) that would normally + be returned by the singular Standard Delete method. + - If a failed request can eventually succeed due to server-side retries, such + transient errors **must not** be communicated using `failed_requests`. + - When all requests in the batch fail, `Operation.error` **must** be set with + `code = google.rpc.Code.ABORTED` and `message = "None of the requests + succeeded, refer to the BatchDeleteBooksOperationMetadata.failed_requests + for individual error details"`. +- The metadata message **may** include other fields to communicate the + operation progress. + +### Adopting Partial Success + +In order for an existing Batch API to adopt the partial success pattern, the API +must do the following: + +- The default behavior must be retained to avoid incompatible behavioral + changes. +- If the API returns an Operation: + - The request message **must** have a `bool return_partial_success` field. + - The Operation `metadata_type` **must** include a + `map<int32, google.rpc.Status> failed_requests` field. 
+ - When the `bool return_partial_success` field is set to true in a request, + the API should allow partial success behavior; otherwise it should continue + with atomic behavior as default. +- If the API returns a direct response synchronously: + - Since the existing clients will treat a success response as an atomic + operation, the existing version of the API **must not** adopt the partial + success pattern. + - A new version **must** be created instead that returns an Operation and + follows the partial success pattern described in this AIP. + +## Rationale + +### Restricting synchronous batch methods to be atomic + +The restriction that synchronous batch methods must be atomic is a result of +the following considerations. + +The previous iteration of this AIP recommended that batch methods must be atomic. +There is no clear way to convey partial failure in a synchronous response status code +because an OK status implies that every request succeeded. Therefore, adding a new field to the +response to indicate partial failure would be a breaking change because the +existing clients would interpret an OK response as all resources deleted. + +On the other hand, as described in [AIP-193](https://aip.dev/193), Operations +are more capable of presenting partial states. The response status code for an +Operation does not convey anything about the outcome of the underlying operation +and a client has to check the response body to determine if the operation was +successful. + +### Communicating partial failures + +The AIP recommends using a `map<int32, google.rpc.Status> failed_requests` field +to communicate partial failures, where the key is the index of the failed +request in the original batch request. The other options considered were: + +- A `repeated google.rpc.Status` field. This was rejected because it is not + clear which entry corresponds to which request. +- A `map<string, google.rpc.Status>` field, where the key is the request id of + the failed request. 
This was rejected because: + - Clients would need to maintain a map of request id to request in order to use + the partial success response. + - Populating a request id for the purpose of communicating errors could + conflict with [AIP-155](https://aip.dev/155) if the service cannot + guarantee idempotency for an individual request across multiple batch + requests. +- A `repeated FailedRequest` field, where `FailedRequest` contains the individual + delete request and the `google.rpc.Status`. This was rejected because echoing + the request payload back in response is discouraged due to additional + challenges around user data sensitivity. + [aip-122-names]: ./0122.md#fields-representing-resource-names [aip-122-parent]: ./0122.md#fields-representing-a-resources-parent [aip-165]: ./0165.md @@ -178,6 +292,8 @@ message BatchDeleteBooksResponse { ## Changelog +- **2025-03-06**: Changed recommendation to allow partial success, along with + detailed guidance - **2022-06-02:** Changed suffix descriptions to eliminate superfluous "-". - **2020-09-16**: Suggested annotating `parent`, `names`, and `requests` fields. - **2020-08-27**: Removed parent recommendations for top-level resources.
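
Review note: the two message shapes that the "Operation metadata message" and "Adopting Partial Success" sections call for could look roughly like the proto sketch here, using the Batch Delete names from the diff. This is an illustration under stated assumptions, not text from the AIPs. The package name, the field numbers, and the `int32` key type are hypothetical, and the existing request fields are omitted. Only the field names `failed_requests` and `return_partial_success`, the `bool` type, and the use of `google.rpc.Status` for errors come from the diff.

```proto
syntax = "proto3";

// Hypothetical package name, chosen only for this sketch.
package example.library.v1;

import "google/rpc/status.proto";

// Metadata for the Operation returned by an asynchronous BatchDeleteBooks.
message BatchDeleteBooksOperationMetadata {
  // Requests that failed permanently, keyed by the index of the request in
  // the batch request. Transient errors that server-side retries may still
  // resolve are not reported here.
  // Assumption: the int32 key type and the field number are illustrative.
  map<int32, google.rpc.Status> failed_requests = 1;
}

// Only the field added when an existing Operation-returning method adopts
// partial success is shown; the existing request fields are omitted.
message BatchDeleteBooksRequest {
  // When true, the method may partially succeed and reports per-request
  // failures in the operation metadata. When false or unset, the method
  // keeps its atomic behavior.
  // Assumption: the field number is illustrative.
  bool return_partial_success = 4;
}
```

With this shape, a client that sent three requests and finds a single entry under key `1` in `failed_requests` can tell that the second request failed and the other two did not, without the server echoing any request payload back.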