
Coordinate microservices: Error Handling in AWS Step Functions
How do microservices “throw” exceptions to AWS Step Functions?
In the article “Development microservices: AWS Step Functions (Orchestrate)”, we used AWS Step Functions to orchestrate several microservices so that together they implement the following workflow:

In the workflow shown above, when a customer submits an order, the “book hotel” step must be confirmed as successful before “book tickets to tourist attractions” is executed.
Suppose that, while booking the hotel, the network cannot reach the server, the server is down, or the hotel is fully booked, or that booking the tickets for the tourist attraction fails, and so on. How can we catch all of these different exceptions and handle them appropriately?
This article explores how AWS Step Functions catches and handles exceptions.
First, let's look at how the microservices' Lambda functions “throw” exceptions to AWS Step Functions.
- Invalid input value; response.status === 418
The invalid-input exception, InvalidInputError, is implemented as a class that inherits from “Error”. The code is as follows:
class InvalidInputError extends Error {
  constructor(message) {
    super(message);
    this.name = 'InvalidInputError';
    // clean up the stack trace
    if (Error.captureStackTrace) {
      Error.captureStackTrace(this, InvalidInputError);
    }
  }
}
- Network exception or server unable to provide service; response.status === 503
The exception for a network failure or an unavailable server, TransientError, is also implemented as a class that inherits from “Error”. The code is as follows:
class TransientError extends Error {
  constructor(message) {
    super(message);
    this.name = 'TransientError';
    // clean up the stack trace
    if (Error.captureStackTrace) {
      Error.captureStackTrace(this, TransientError);
    }
  }
}
- The HTTP status code determines how the microservice's Lambda function responds:
- When the HTTP status code is 200 (response.ok), the JSON is returned to AWS Step Functions.
- When the HTTP status code is 418 (response.status === 418), the function throws InvalidInputError to AWS Step Functions.
- When the HTTP status code is 503 (response.status === 503), the function throws TransientError to AWS Step Functions.
The code is as follows:
async function checkResponseStatus(response) {
  if (response.ok) {
    // success: hand the response back to the caller
    return response;
  } else if (response.status === 418) {
    // invalid input: retrying would not help
    throw new InvalidInputError(`The HTTP status of the response is ${response.status} ; Invalid Input Error`);
  } else if (response.status === 503) {
    // network or server problem: may succeed on a later attempt
    throw new TransientError(`The HTTP status of the response is ${response.status} ; Transient Error`);
  } else {
    throw new Error("There was an unknown error with hotel booking process.");
  }
}
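To make the link to Step Functions concrete: when a Node.js Lambda handler lets an error propagate, the error's name is what a state machine can match with “ErrorEquals”, which is why both classes set this.name. The following is only a minimal sketch of how a handler might use checkResponseStatus, not the code from the article's repository. The names bookHotel, httpPost, and HOTEL_API_URL are hypothetical, and the HTTP call is injected so the sketch runs without a network.

```javascript
// Sketch only: bookHotel, httpPost and HOTEL_API_URL are hypothetical names.
const HOTEL_API_URL = 'https://example.com/hotel'; // placeholder, not a real endpoint

class InvalidInputError extends Error {
  constructor(message) { super(message); this.name = 'InvalidInputError'; }
}
class TransientError extends Error {
  constructor(message) { super(message); this.name = 'TransientError'; }
}

async function checkResponseStatus(response) {
  if (response.ok) return response;
  if (response.status === 418) throw new InvalidInputError(`HTTP ${response.status}; Invalid Input Error`);
  if (response.status === 503) throw new TransientError(`HTTP ${response.status}; Transient Error`);
  throw new Error('There was an unknown error with hotel booking process.');
}

// httpPost(url, body) must resolve to an object with ok, status and json(),
// for example a thin wrapper around fetch in a real Lambda function.
async function bookHotel(event, httpPost) {
  const response = await httpPost(HOTEL_API_URL, event.hotel);
  await checkResponseStatus(response); // throws on 418, 503, or any other failure
  return response.json();              // on success, the JSON goes back to the workflow
}

// Usage with a stubbed 503 response: the rejection carries the error name.
const stub503 = async () => ({ ok: false, status: 503, json: async () => ({}) });
bookHotel({ hotel: {} }, stub503).catch((err) => console.log(err.name)); // TransientError
```

Because the error is not caught inside the handler, the invocation fails, and the workflow can then decide what to do based on the error name.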
AWS Step Functions catches and handles exceptions
Here, AWS Step Functions must be able to tell whether an exception is a TransientError or an InvalidInputError.
TransientError (response.status === 503)
- A TransientError (response.status === 503) indicates a network failure or a server that cannot provide service.
- AWS Step Functions handles a TransientError (response.status === 503) by having the microservice's Lambda function “Retry”.
The AWS Step Functions code that handles TransientError (response.status === 503) is as follows:
"Retry": [
  {
    "ErrorEquals": [
      "TransientError"
    ],
    "MaxAttempts": 3
  }
]
- With “Retry”, AWS Step Functions lets the microservice's Lambda function be retried up to “MaxAttempts”: 3 times.
- “ErrorEquals” filters which exceptions this rule applies to: the Lambda function is retried only when the exception name equals “TransientError”.
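The retry rule above sets only “MaxAttempts”, so the other retry fields fall back to their Amazon States Language defaults (IntervalSeconds of 1 and BackoffRate of 2.0). As a rough illustration of what that means in practice, the sketch below computes the wait before each retry. The helper name retryDelays is hypothetical, and the model is simplified: it ignores optional fields such as MaxDelaySeconds and JitterStrategy.

```javascript
// Simplified model of a Retry rule's timing. retryDelays is a hypothetical
// helper for illustration; it is not part of any AWS SDK.
function retryDelays(retrier) {
  // Defaults as defined by Amazon States Language.
  const { IntervalSeconds = 1, MaxAttempts = 3, BackoffRate = 2.0 } = retrier;
  const delays = [];
  for (let attempt = 0; attempt < MaxAttempts; attempt++) {
    // Wait before retry n is IntervalSeconds * BackoffRate^(n - 1).
    delays.push(IntervalSeconds * Math.pow(BackoffRate, attempt));
  }
  return delays;
}

console.log(retryDelays({ ErrorEquals: ['TransientError'], MaxAttempts: 3 })); // [ 1, 2, 4 ]
```

Under these assumptions the Lambda function is invoked at most four times (the first attempt plus three retries), with waits of 1, 2, and 4 seconds in between. If it still throws TransientError after that, the error is no longer retried.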
InvalidInputError (response.status === 418)
- An InvalidInputError (response.status === 418) indicates that an input value is incorrect.
- AWS Step Functions catches InvalidInputError (response.status === 418) with “Catch” and “ErrorEquals”.
- AWS Step Functions handles InvalidInputError (response.status === 418) by transitioning to a “Fail” state, which ends the execution with the corresponding error.
The AWS Step Functions code that handles InvalidInputError (response.status === 418) is as follows:
"Catch": [
  {
    "ErrorEquals": [
      "InvalidInputError"
    ],
    "Next": "BookHotelInvalidInputError"
  }
]
"BookHotelInvalidInputError": {
  "Type": "Fail",
  "Error": "InvalidInputError",
  "Cause": "This is a fallback from a BookHotel Lambda function exception; Invalid Input Error."
},
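The “Retry” and “Catch” fragments shown so far both belong to the Task state that books the hotel. The sketch below shows one way they might sit together, written as a JavaScript object so it can be checked and printed as JSON. The state names BookHotelState and BookMuseumState and the function name in the ARN are assumptions made for illustration; only the “Retry” rule, the “Catch” rule, and the Fail state come from the article. The helper danglingCatchTargets is also hypothetical.

```javascript
// Hypothetical assembly of the fragments above. BookHotelState, BookMuseumState
// and the Lambda function name in the ARN are placeholders, not the real names.
const states = {
  BookHotelState: {
    Type: 'Task',
    Resource: 'arn:aws:lambda:region:accountId:function:BookHotel', // placeholder ARN
    Retry: [{ ErrorEquals: ['TransientError'], MaxAttempts: 3 }],
    Catch: [{ ErrorEquals: ['InvalidInputError'], Next: 'BookHotelInvalidInputError' }],
    Next: 'BookMuseumState', // assumed name of the following state (not defined here)
  },
  BookHotelInvalidInputError: {
    Type: 'Fail',
    Error: 'InvalidInputError',
    Cause: 'This is a fallback from a BookHotel Lambda function exception; Invalid Input Error.',
  },
};

// Small sanity check: every Catch rule should point at a state that exists.
function danglingCatchTargets(allStates) {
  const dangling = [];
  for (const state of Object.values(allStates)) {
    for (const catcher of state.Catch || []) {
      if (!(catcher.Next in allStates)) dangling.push(catcher.Next);
    }
  }
  return dangling;
}

console.log(danglingCatchTargets(states)); // []
console.log(JSON.stringify(states, null, 2));
```

In this layout, a TransientError is retried first, while an InvalidInputError skips the retries (its name does not match the “Retry” rule) and goes straight to the Fail state.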
Exceptions from the microservice BookMuseumClient
- When AWS Step Functions catches any exception from the microservice BookMuseumClient, using “Catch”, “ErrorEquals”, and “States.ALL”, it enters the state “CancelHotelState”.
- The state “CancelHotelState” calls the microservice CancelHotelServer.
- The microservice CancelHotelServer cancels the hotel that was booked earlier.
The relevant AWS Step Functions code is as follows:
"Catch": [
  {
    "ErrorEquals": [
      "States.ALL"
    ],
    "Next": "CancelHotelState"
  }
]
"CancelHotelState": {
  "Type": "Task",
  "Next": "BookMuseumAllFallback",
  "Resource": "arn:aws:lambda:region:accountId:function:CancelHotelServer"
},
"BookMuseumAllFallback": {
  "Type": "Fail",
  "Error": "States.ALL",
  "Cause": "This is a fallback from any error code in BookMuseum Lambda function."
}
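The difference between catching a named error and catching “States.ALL” can be illustrated with a small model of how a “Catch” list is evaluated: the rules are scanned in order, and the first rule whose “ErrorEquals” contains the error name, or “States.ALL”, decides the next state. This is a simplified sketch for illustration only; nextStateFor is a hypothetical helper, and the real service has a few special cases for reserved error names that are not modeled here.

```javascript
// Simplified model of Catch matching. nextStateFor is a hypothetical helper,
// not an AWS API; reserved-error special cases are intentionally left out.
function nextStateFor(catchers, errorName) {
  for (const catcher of catchers) {
    const names = catcher.ErrorEquals;
    if (names.includes(errorName) || names.includes('States.ALL')) {
      return catcher.Next; // first matching rule wins
    }
  }
  return null; // no rule matched: the execution fails with this error
}

// The two Catch lists from the article.
const bookHotelCatch = [{ ErrorEquals: ['InvalidInputError'], Next: 'BookHotelInvalidInputError' }];
const bookMuseumCatch = [{ ErrorEquals: ['States.ALL'], Next: 'CancelHotelState' }];

console.log(nextStateFor(bookHotelCatch, 'InvalidInputError'));  // BookHotelInvalidInputError
console.log(nextStateFor(bookHotelCatch, 'TransientError'));     // null
console.log(nextStateFor(bookMuseumCatch, 'InvalidInputError')); // CancelHotelState
console.log(nextStateFor(bookMuseumCatch, 'TransientError'));    // CancelHotelState
```

So any failure in the museum booking, whatever its name, leads to CancelHotelState, which undoes the hotel booking before the workflow ends in the BookMuseumAllFallback state.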
The state diagram of AWS Step Functions catching and handling exceptions is as follows:

A. The input for testing how AWS Step Functions catches and handles exceptions is as follows:
Note: the end_date of the hotel booking is entered incorrectly.
{
  "purchase": {
    "buyer_id": "mariano"
  },
  "hotel": {
    "start_date": "2020-03-13",
    "end_date": "20-03-15"
  },
  "museum": {
    "museum_name": "tate gallery",
    "when": "2020-03-14"
  }
}
- The results of the test are as follows:

B. The input for testing how AWS Step Functions catches and handles exceptions is as follows:
Note: the “when” value of the museum booking is entered incorrectly.
{
  "purchase": {
    "buyer_id": "mariano"
  },
  "hotel": {
    "start_date": "2020-03-13",
    "end_date": "2020-03-15"
  },
  "museum": {
    "museum_name": "tate gallery",
    "when": "20-03-14"
  }
}
- The results of the test are as follows:

For the complete code showing how AWS Step Functions catches and handles exceptions, see:
https://github.com/KenFang/handle_error_book_workflow