[聚合文章] Koa 请求日志打点工具（二）——使用 async_hooks

原文链接: https://zhuanlan.zhihu.com/p/32984398

Koa 请求日志打点工具（二）——使用 async_hooks

10 hours ago

我在一年前写过一篇《Koa 请求日志打点工具》，公司内部一直使用至今，也定位过不少问题。但 koa-yield-breakpoint 仍然有一个很大的缺憾，即：无法记录除 controller 或者 router 之外的函数的执行时间（因为获取不到当前请求的 "this"）。举个更通俗的例子：我在一个路由的 controller 里面调用了 A ，A 调用了 B ，B 又调用了 C...这是我们非常常见的用法，但之前使用 koa-yield-breakpoint 只能获取 A 的执行时间，无法获取 B 和 C 的执行时间。

根本原因在于：无法知道函数之间的调用关系，即 B 不知道是 A 调用的它，即使知道也不知道是哪次请求到来时执行的 A 调用的它。

但是！

node@8.1 引入了一个黑魔法——Async Hooks。

Async Hooks

我们先看下 node@8.1 新加入的 async_hooks 是个什么东西。官网介绍：

The async_hooks module provides an API to register callbacks tracking the lifetime of asynchronous resources created inside a Node.js application.

一句话概括：async_hooks 用来追踪 Node.js 中异步资源的生命周期。

我们来看个例子：

const fs = require('fs')const async_hooks = require('async_hooks')async_hooks.createHook({  init (asyncId, type, triggerAsyncId, resource) {    fs.writeSync(1, `${type}(${asyncId}): trigger: ${triggerAsyncId}\n`)  },  destroy (asyncId) {    fs.writeSync(1, `destroy: ${asyncId}\n`);  }}).enable()async function A () {  fs.writeSync(1, `A -> ${async_hooks.executionAsyncId()}\n`)  setTimeout(() => {    fs.writeSync(1, `A in setTimeout -> ${async_hooks.executionAsyncId()}\n`)    B()  })}async function B () {  fs.writeSync(1, `B -> ${async_hooks.executionAsyncId()}\n`)  process.nextTick(() => {    fs.writeSync(1, `B in process.nextTick -> ${async_hooks.executionAsyncId()}\n`)    C()    C()  })}function C () {  fs.writeSync(1, `C -> ${async_hooks.executionAsyncId()}\n`)  Promise.resolve().then(() => {    fs.writeSync(1, `C in promise.then -> ${async_hooks.executionAsyncId()}\n`)  })}fs.writeSync(1, `top level -> ${async_hooks.executionAsyncId()}\n`)A()

async_hooks.createHook 可以注册 4 个方法来跟踪所有异步资源的初始化(init)，回调之前(before)，回调之后(after)，销毁后(destroy)事件，并通过调用 .enable() 启用，调用 .disable() 关闭。

这里我们只关心异步资源的初始化和销毁的事件，并使用 fs.writeSync(1, msg) 打印到标准输出，writeSync 第一个参数接收文件描述符，1 表示标准输出。为什么不使用 console.log 呢？因为 console.log 是一个异步操作，如果在 init, before, after, destroy 事件处理函数中出现，就会导致无限循环。

运行该程序，打印如下（重点地方已加粗）：

top level -> 1PROMISE(6): trigger: 1A -> 1Timeout(7): trigger: 1TIMERWRAP(8): trigger: 1A in setTimeout -> 7PROMISE(9): trigger: 7B -> 7TickObject(10): trigger: 7B in process.nextTick -> 10C -> 10PROMISE(11): trigger: 10PROMISE(12): trigger: 11C -> 10PROMISE(13): trigger: 10PROMISE(14): trigger: 13C in promise.then -> 12C in promise.then -> 14destroy: 7destroy: 10destroy: 8

这段程序的打印结果包含了很多信息，我们逐个解释：

为了实现对异步资源的跟踪，node 对每一个函数（不论异步还是同步）提供了一个 async scope，我们可以通过调用 async_hooks.executionAsyncId() 来获取函数当前的 async scope 的 id（称为 asyncId），通过调用 async_hooks.triggerAsyncId() 来获取当前函数调用者的 asyncId。
异步资源创建时触发 init 事件函数，init 函数中的第一个参数代表该异步资源的 asyncId，type 表示异步资源的类型（如 TCPWRAP, PROMISE, Timeout, Immediate, TickObject 等等），triggerAsyncId 表示该异步资源的调用者的 asyncId。异步资源销毁时触发 destroy 事件函数，该函数只接收一个参数即该异步资源的 asyncId。
函数调用关系明确。从上面的打印结果我们可以很容易看出（从下往上看）：C(asyncId: 10) 被 B(asyncId: 7) 调用，B(asyncId: 7) 被 A(asyncId: 1) 调用。而且 C 的 promise.then 里面的 asyncId(值为 12/14) 也可以通过 12/14 -> 11/13 -> 10 定位到 C 的 asyncId(值为 10)。
同步函数每次调用的 asyncId 都一样，如上 C 调用了 2 次，都打印的 C -> 10，与调用方的作用域的 asyncId 一致，即上面打印的 B in process.nextTick -> 10。异步函数每次调用的 asyncId 不一样，如上面打印的 C in promise.then -> 12 和 C in promise.then -> 14。
最外层作用域的 asyncId 总是 1，每个异步资源创建时 asyncId 全局递增。

上面 5 条结论非常重要。你可能会问：给我一堆数字(asyncId)有卵用？别急，接下来我们看看如何使用 async_hooks 改造 koa-yield-breakpoint。

改造 koa-yield-breakpoint

通过前面结论我们知道，使用 async_hooks 可以通过 asyncId 串起函数的调用关系，但是如何将这些函数的调用链与 koa 接收的每个请求关联起来呢?

首先，定义一个全局 Map，存储函数的调用关系：

const async_hooks = require('async_hooks')const asyncIdMap = new Map()async_hooks.createHook({  init (asyncId, type, triggerAsyncId) {    const ctx = getCtx(triggerAsyncId)    if (ctx) {      asyncIdMap.set(asyncId, ctx)    } else {      asyncIdMap.set(asyncId, triggerAsyncId)    }  },  destroy (asyncId) {    asyncIdMap.delete(asyncId)  }}).enable()function getCtx (asyncId) {  if (!asyncId) {    return  }  if (typeof asyncId === 'object' && asyncId.app) {    return asyncId  }  return getCtx(asyncIdMap.get(asyncId))}

有 3 点需要解释：

定义了一个全局 Map 存储函数的调用关系，在适当的地方（下面会讲到）将当前请求的 this 存储到 Map 中，key 是 asyncId。
每个异步资源初始化时，会拿 asyncId 尝试往上寻找祖先的 value 是否是 ctx（koa 应用中每个请求的 this），如果有则直接设置 value 是 ctx，否则设置为调用者的 asyncId（即 triggerAsyncId）。
destroy 事件函数里直接删除调用关系，保证了不会引起内存泄漏，即杜绝了引用 ctx 没有释放的情况。

然后修改 global[loggerName] 如下：

global[loggerName] = function * (ctx, fn, fnStr, filename) {  const originalContext = ctx  let requestId = _getRequestId()  const asyncId = async_hooks.executionAsyncId()  if (!requestId) {    const _ctx = getCtx(asyncId)    if (_ctx) {      ctx = _ctx      requestId = _getRequestId()    }  } else {    asyncIdMap.set(asyncId, ctx)  }  if (requestId) {    _logger('beforeYield')  }  const result = yield * fn.call(originalContext)  if (requestId) {    _logger('afterYield', result)  }  return result  function _getRequestId () {    return ctx && ctx.app && _.get(ctx, requestIdPath)  }  function _logger () {    ...  }}

有 2 点需要解释：

logger 函数传入的第一个参数 ctx，之前是每个请求的 this，现在可能是当前执行上下文的 this，所以先将 ctx 赋值给 originalContext，然后通过 yield * fn.call(originalContext) 函数执行时有正确的上下文。
如果传入的 ctx 是来自请求的 this 且能拿到 requestId，那么将当前 asyncId 和 ctx 写入 Map，如果不是来自请求的 this，则尝试从 Map 里往上寻找祖先的 value 是否是 ctx，如果找到则覆盖当前的 ctx 并拿到 requestId。

至此，koa-yield-breakpoint 全部改造完毕，是不是很简单？

接下来我们通过一个例子验证下升级后的 koa-yield-breakpoint：

app.js

const path = require('path')const koaYieldBreakpoint = require('koa-yield-breakpoint')({  files: [path.join(__dirname, '**/*.js')]})const koa = require('koa')const route = require('koa-route')const app = koa()app.use(koaYieldBreakpoint)app.use(route.post('/users', require('./routes/users').createUser))app.listen(3000, () => {  console.log('listening on 3000')})

routes/users.js

const Mongolass = require('mongolass')const mongolass = new Mongolass('mongodb://localhost:27017/test')const User = mongolass.model('User')const Post = mongolass.model('Post')const Comment = mongolass.model('Comment')exports.createUser = function * () {  const name = this.query.name || 'default'  const age = +this.query.age || 18  yield createUser(name, age)  this.status = 204}function * createUser (name, age) {  const user = (yield User.create({    name,    age  })).ops[0]  yield createPost(user)}function * createPost (user) {  const post = (yield Post.create({    uid: user._id,    title: 'post',    content: 'post'  })).ops[0]  yield createComment(user, post)}function * createComment (user, post) {  yield Comment.create({    userId: user._id,    postId: post._id,    content: 'comment'  })}

这里我们访问创建用户接口时，调用 createUser，createUser 里面又调用了 createPost，createPost 里面调用了 createComment。

$ curl -XPOST localhost:3000/users

打印如下：

{ type: 'start',  step: 1,  take: 0 ... }{ type: 'beforeYield',  step: 2,  fn: 'createUser(name, age)',  take: 1 ... }{ type: 'beforeYield',  step: 3,  fn: 'User.create(...)',  take: 1 ... }{ type: 'afterYield',  step: 4,  fn: 'User.create(...)',  take: 36 ... }{ type: 'beforeYield',  step: 5,  fn: 'createPost(user)',  take: 1 ... }{ type: 'beforeYield',  step: 6,  fn: 'Post.create(...)',  take: 0 ... }{ type: 'afterYield',  step: 7,  fn: 'Post.create(...)',  take: 3 ... }{ type: 'beforeYield',  step: 8,  fn: 'createComment(user, post)',  take: 1 ... }{ type: 'beforeYield',  step: 9,  fn: 'Comment.create(...)',  take: 0 ... }{ type: 'afterYield',  step: 10,  fn: 'Comment.create(...)',  take: 1 ... }{ type: 'afterYield',  step: 11,  fn: 'createComment(user, post)',  take: 1 ... }{ type: 'afterYield',  step: 12,  fn: 'createPost(user)',  take: 6 ... }{ type: 'afterYield',  step: 13,  fn: 'createUser(name, age)',  take: 44 ... }{ type: 'end',  step: 14,  take: 0 ... }

至此，一个全链路、无侵入、强大的日志打点工具就完成了。

参考文献

koa 日志

注：本文内容来自互联网，旨在为开发者提供分享、交流的平台。如有涉及文章版权等事宜，请你联系站长进行处理。