Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add syncing_status query parameter to eth/v1/node/health #5790

Merged
merged 12 commits into from
Aug 1, 2023
20 changes: 14 additions & 6 deletions packages/api/src/beacon/routes/node.ts
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,10 @@ export enum NodeHealth {
NOT_INITIALIZED_OR_ISSUES = HttpStatusCode.SERVICE_UNAVAILABLE,
}

export type NodeHealthOptions = {
syncingStatus?: number;
};

/**
* Read information about the beacon node.
*/
Expand Down Expand Up @@ -120,13 +124,13 @@ export type Api = {
/**
* Get health check
* Returns node health status in http status codes. Useful for load balancers.
*
* NOTE: This route does not return any value
*/
getHealth(): Promise<
getHealth(
options?: NodeHealthOptions
): Promise<
ApiClientResponse<
{[HttpStatusCode.OK]: void; [HttpStatusCode.PARTIAL_CONTENT]: void},
HttpStatusCode.SERVICE_UNAVAILABLE
HttpStatusCode.BAD_REQUEST | HttpStatusCode.SERVICE_UNAVAILABLE
>
>;
};
Expand All @@ -150,7 +154,7 @@ export type ReqTypes = {
getPeerCount: ReqEmpty;
getNodeVersion: ReqEmpty;
getSyncingStatus: ReqEmpty;
getHealth: ReqEmpty;
getHealth: {query: {syncing_status?: number}};
};

export function getReqSerializers(): ReqSerializers<Api, ReqTypes> {
Expand All @@ -171,7 +175,11 @@ export function getReqSerializers(): ReqSerializers<Api, ReqTypes> {
getPeerCount: reqEmpty,
getNodeVersion: reqEmpty,
getSyncingStatus: reqEmpty,
getHealth: reqEmpty,
getHealth: {
writeReq: (options) => ({query: {syncing_status: options?.syncingStatus}}),
parseReq: ({query}) => [{syncingStatus: query.syncing_status}],
schema: {query: {syncing_status: Schema.Uint}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend moving the range check in the schema.

Copy link
Member Author

@nflaig nflaig Jul 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example API that does this? Based on current types this is not possible as far as I can see, we only support pre-defined schemas

export enum Schema {

We enforce basic validation through schema but the error response and logging is not ideal in my opinion

JSON API response

Schema validation error

[
  {
    "instancePath": "/syncing_status",
    "schemaPath": "#/properties/syncing_status/minimum",
    "keyword": "minimum",
    "params": {
      "comparison": ">=",
      "limit": 0
    },
    "message": "must be >= 0"
  }
]

API error

{
  "statusCode": 400,
  "error": "Bad Request",
  "message": "Invalid syncing status code: 600"
}

Error log

Schema validation error

Eph 0/2 1.126[rest]            error: Req req-o getHealth error  querystring/syncing_status must be >= 0
Error: querystring/syncing_status must be >= 0
    at defaultSchemaErrorFormatter (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/context.js:108:10)
    at wrapValidationError (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/validation.js:228:17)
    at validate (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/validation.js:157:16)
    at preValidationCallback (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/handleRequest.js:92:25)
    at handler (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/handleRequest.js:76:7)
    at handleRequest (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/handleRequest.js:24:5)
    at runPreParsing (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/route.js:569:5)
    at next (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/hooks.js:179:7)
    at handleResolve (/home/nico/projects/ethereum/lodestar/node_modules/fastify/lib/hooks.js:196:5)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

API error

Eph 0/0 5.017[rest]             warn: Req req-9 getHealth failed reason=Invalid syncing status code: 600

In both cases as it is right now, it is better to use API error. Schema validation could be made more flexible but there are not that many values in the spec that enforce things like minimum / maximum.

},
};
}

Expand Down
25 changes: 1 addition & 24 deletions packages/api/src/beacon/server/node.ts
Original file line number Diff line number Diff line change
@@ -1,32 +1,9 @@
import {ChainForkConfig} from "@lodestar/config";
import {Api, ReqTypes, routesData, getReturnTypes, getReqSerializers} from "../routes/node.js";
import {ServerRoutes, getGenericJsonServer} from "../../utils/server/index.js";
import {HttpStatusCode} from "../../utils/client/httpStatusCode.js";
import {ServerApi} from "../../interfaces.js";

export function getRoutes(config: ChainForkConfig, api: ServerApi<Api>): ServerRoutes<ServerApi<Api>, ReqTypes> {
// All routes return JSON, use a server auto-generator
const serverRoutes = getGenericJsonServer<ServerApi<Api>, ReqTypes>(
{routesData, getReturnTypes, getReqSerializers},
config,
api
);

return {
...serverRoutes,

getHealth: {
...serverRoutes.getHealth,
handler: async (req, res) => {
try {
await api.getHealth();
res.raw.writeHead(HttpStatusCode.OK);
res.raw.end();
} catch (e) {
res.raw.writeHead(HttpStatusCode.INTERNAL_SERVER_ERROR);
res.raw.end();
}
},
},
};
return getGenericJsonServer<ServerApi<Api>, ReqTypes>({routesData, getReturnTypes, getReqSerializers}, config, api);
}
6 changes: 3 additions & 3 deletions packages/api/src/interfaces.ts
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ export type APIClientHandler = (...args: any) => PromiseLike<ApiClientResponse>;
export type APIServerHandler = (...args: any) => PromiseLike<unknown>;

export type ApiClientSuccessResponse<S extends keyof any, T> = {ok: true; status: S; response: T; error?: never};
export type ApiClientSErrorResponse<S extends Exclude<HttpStatusCode, HttpSuccessCodes>> = {
export type ApiClientErrorResponse<S extends Exclude<HttpStatusCode, HttpSuccessCodes>> = {
ok: false;
status: S;
response?: never;
Expand All @@ -18,8 +18,8 @@ export type ApiClientResponse<
E extends Exclude<HttpStatusCode, HttpSuccessCodes> = Exclude<HttpStatusCode, HttpSuccessCodes>,
> =
| {[K in keyof S]: ApiClientSuccessResponse<K, S[K]>}[keyof S]
| {[K in E]: ApiClientSErrorResponse<K>}[E]
| ApiClientSErrorResponse<HttpStatusCode.INTERNAL_SERVER_ERROR>;
| {[K in E]: ApiClientErrorResponse<K>}[E]
| ApiClientErrorResponse<HttpStatusCode.INTERNAL_SERVER_ERROR>;

export type ApiClientResponseData<T extends ApiClientResponse> = T extends {ok: true; response: infer R} ? R : never;

Expand Down
9 changes: 6 additions & 3 deletions packages/api/src/utils/client/client.ts
Original file line number Diff line number Diff line change
Expand Up @@ -80,14 +80,17 @@ export function generateGenericJsonClient<
}
} catch (err) {
if (err instanceof HttpError) {
return {ok: false, error: {code: err.status, message: err.message, operationId: routeId}} as ReturnType<
Api[keyof Api]
>;
return {
ok: false,
status: err.status,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this was just missed previously as the code here is not really type-safe, didn't find a quick solution to fix that but we should always set status in client response.

error: {code: err.status, message: err.message, operationId: routeId},
} as ReturnType<Api[keyof Api]>;
}

if (err instanceof TimeoutError) {
return {
ok: false,
status: HttpStatusCode.INTERNAL_SERVER_ERROR,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This status code is kinda wrong but 408 would also be wrong as it would indicate a server timeout whereas this error happens due to client timeout.

If server doesn't respond within client timeout we can assume a server issue, setting 500 here is the most accurate. Alternative would be to just throw the error here but this makes the error logs less informative if a timeout happens.

error: {code: HttpStatusCode.INTERNAL_SERVER_ERROR, message: err.message, operationId: routeId},
} as ReturnType<Api[keyof Api]>;
}
Expand Down
2 changes: 1 addition & 1 deletion packages/api/src/utils/server/genericJsonServer.ts
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ export function getGenericJsonServer<

handler: async function handler(this: FastifyInstance, req, resp): Promise<unknown | void> {
const args: any[] = routeSerdes.parseReq(req as ReqGeneric as ReqTypes[keyof Api]);
const data = (await api[routeId](...args)) as Resolves<Api[keyof Api]>;
const data = (await api[routeId](...args, req, resp)) as Resolves<Api[keyof Api]>;
nflaig marked this conversation as resolved.
Show resolved Hide resolved

if (routeDef.statusOk !== undefined) {
resp.statusCode = routeDef.statusOk;
Expand Down
2 changes: 1 addition & 1 deletion packages/api/test/unit/beacon/testData/node.ts
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ export const testData: GenericServerTestCases<Api> = {
res: {data: {headSlot: "1", syncDistance: "2", isSyncing: false, isOptimistic: true, elOffline: false}},
},
getHealth: {
args: [],
args: [{syncingStatus: 206}],
res: undefined,
},
};
12 changes: 11 additions & 1 deletion packages/api/test/utils/genericServerTest.ts
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
import {ServerResponse} from "node:http";
import {FastifyReply} from "fastify";
import {expect} from "chai";
import {ChainForkConfig} from "@lodestar/config";
import {ReqGeneric, Resolves} from "../../src/utils/index.js";
Expand Down Expand Up @@ -58,7 +60,15 @@ export function runGenericServerTest<

// Assert server handler called with correct args
expect(mockApi[routeId].callCount).to.equal(1, `mockApi[${routeId as string}] must be called once`);
expect(mockApi[routeId].getCall(0).args).to.deep.equal(testCase.args, `mockApi[${routeId as string}] wrong args`);
let {args} = mockApi[routeId].getCall(0);
// Default handler injects request and response as last two args
const lastArg = args[args.length - 1];
// Detect if last arg is response as custom handlers might not inject it
if (typeof lastArg === "object" && (lastArg as FastifyReply).raw instanceof ServerResponse) {
// Remove request and response before assertion
args = args.slice(0, -2) as typeof args;
}
expect(args).to.deep.equal(testCase.args, `mockApi[${routeId as string}] wrong args`);

// Assert returned value is correct
expect(res.response).to.deep.equal(testCase.res, "Wrong returned value");
Expand Down
12 changes: 10 additions & 2 deletions packages/beacon-node/src/api/impl/node/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -66,10 +66,18 @@ export function getNodeApi(
return {data: sync.getSyncStatus()};
},

async getHealth(_req, res) {
async getHealth(options, _req, res) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhmm I didn't noticed that request and response objects where being leaked to the API implementation. I see it was introduced by #4994 but why not keep the interaction in the Fastify's response object in packages/api/src/beacon/server/node.ts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked at this as well but the only "clean way" to achieve this without too many type hacks is approach used in #3349 but in this case the getHealth type is incorrect, it doesn't actually return a res.response of type number.

If you think passing response object to API implementation is not desired we should update types accordingly and use the other approach to implementing this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nazarhussain any opinions here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing res directly to API handler is a pattern which does not meant it's leaking anything.

We can define a data structure for the implementation of API endpoints, that way we can handle the status code at the server implementation side.

So all implementation handlers should return;

{
   body: unknown;
   status?: number; 
}

And if status is not present then we consider 200 as default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Health API is really the only one right now that needs to modify the status code.

The interface of response is limited to setting the status code

export type GenericResponseObject = {code: (code: number) => void};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case we can make the response status code structure as optional. e.g.

() => T | {body: T, status; number}

This will require no refactor on API implementation side and minimum change on the server side.

Copy link
Member Author

@nflaig nflaig Jul 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that could be a good pattern to give some API implementations more control over the response. Maybe we want to allow to set null explicitly as body as well, that way we can more closely follow the spec for APIs that should not return a json body (see #5790 (comment)).

Proposed type would look like this then

() => T | {body: T | null, status: number}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So will you make the change in this PR or separate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to do it in this PR, allows to remove the workaround in tests, should reduce overall diff

Copy link
Member Author

@nflaig nflaig Jul 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Request and response objects are no longer passed to API implementation. After looking into updating return type to something like () => T | {body: T | null, status: number} it does not look that clean and requires too many type checks in other place unrelated to health API.

Better to address in a separate PR.

// Custom value passed via `syncing_status` query parameter
const syncingStatus = options?.syncingStatus;

// Syncing status must be a valid HTTP status code within range 100-599
if (syncingStatus != null && (syncingStatus < 100 || syncingStatus > 599)) {
throw new ApiError(400, `Invalid syncing status code: ${syncingStatus}`);
}

if (sync.getSyncStatus().isSyncing) {
// 206: Node is syncing but can serve incomplete data
res?.code(routes.node.NodeHealth.SYNCING);
res?.code(syncingStatus ?? routes.node.NodeHealth.SYNCING);
} else {
// 200: Node is ready
res?.code(routes.node.NodeHealth.READY);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ describe("beacon node api", function () {
chain: {blsVerifyAllMainThread: true},
},
validatorCount,
logger: testLogger("Node-A", {level: LogLevel.info}),
logger: testLogger("Node-Synced", {level: LogLevel.info}),
});
client = getClient({baseUrl: `http://127.0.0.1:${restPort}`}, {config});
});
Expand Down Expand Up @@ -65,9 +65,8 @@ describe("beacon node api", function () {
});

it("should return 'el_offline' as 'true' when EL not available", async () => {
// Close first instance
await bn.close();
bn = await getDevBeaconNode({
const portElOffline = 9597;
const bnElOffline = await getDevBeaconNode({
params: {
...chainConfigDef,
// eslint-disable-next-line @typescript-eslint/naming-convention
Expand All @@ -82,18 +81,19 @@ describe("beacon node api", function () {
api: {
rest: {
enabled: true,
port: restPort,
port: portElOffline,
},
},
chain: {blsVerifyAllMainThread: true},
},
validatorCount: 5,
logger: testLogger("Node-A", {level: LogLevel.info}),
logger: testLogger("Node-EL-Offline", {level: LogLevel.info}),
});
const clientElOffline = getClient({baseUrl: `http://127.0.0.1:${portElOffline}`}, {config});

// To make BN communicate with EL, it needs to produce some blocks and for that need validators
const {validators} = await getAndInitDevValidators({
node: bn,
node: bnElOffline,
validatorClientCount: 1,
validatorsPerClient: validatorCount,
startIndex: 0,
Expand All @@ -102,12 +102,75 @@ describe("beacon node api", function () {
// Give node sometime to communicate with EL
await sleep(chainConfigDef.SECONDS_PER_SLOT * 2 * 1000);

const res = await client.node.getSyncingStatus();
const res = await clientElOffline.node.getSyncingStatus();
ApiError.assert(res);

expect(res.response.data.elOffline).to.eql(true);

await Promise.all(validators.map((v) => v.close()));
await bnElOffline.close();
});
});

describe("getHealth", () => {
const portSyncing = 9598;

let bnSyncing: BeaconNode;
let clientSyncing: Api;

before(async () => {
bnSyncing = await getDevBeaconNode({
params: chainConfigDef,
options: {
// Node won't consider itself synced without peers
sync: {isSingleNode: false},
api: {
rest: {
enabled: true,
port: portSyncing,
},
},
chain: {blsVerifyAllMainThread: true},
},
validatorCount,
logger: testLogger("Node-Syncing", {level: LogLevel.info}),
});
clientSyncing = getClient({baseUrl: `http://127.0.0.1:${portSyncing}`}, {config});
// Must at least wait for one slot else node considers itself synced pre/at genesis
await sleep(chainConfigDef.SECONDS_PER_SLOT * 1000);
});

after(async () => {
await bnSyncing.close();
});

it("should return 200 status code if node is ready", async () => {
const res = await client.node.getHealth();
expect(res.status).to.equal(200);
});

it("should return 206 status code if node is syncing", async () => {
const res = await clientSyncing.node.getHealth();
expect(res.status).to.equal(206);
});

it("should return custom status code from 'syncing_status' query parameter if node is syncing", async () => {
const statusCode = 204;
const res = await clientSyncing.node.getHealth({syncingStatus: statusCode});
expect(res.status).to.equal(statusCode);
});

it("should only use status code from 'syncing_status' query parameter if node is syncing", async () => {
const res = await client.node.getHealth({syncingStatus: 204});
expect(res.status).to.equal(200);
});

it("should return 400 status code if value of 'syncing_status' query parameter is invalid", async () => {
const res = await clientSyncing.node.getHealth({syncingStatus: 99});
expect(res.status).to.equal(400);

const resp = await clientSyncing.node.getHealth({syncingStatus: 600});
expect(resp.status).to.equal(400);
});
});
});
Loading