libuv 避坑指南

by CoderZh 2022年04月03日 2340 Words ~ 5min reading time | Improve on

libuv 的 uv 是 Unicorn Velociraptor（独角伶盗龙）的缩写，所以它的 Logo 是一只独角伶盗龙的图片。

libuv 是一个跨平台的专注于异步 IO 的库。主要设计用于 Node.js，像 electron 是基于 Node.js 的，vscode 又是基于 electron 的，所以 libuv 是一个使用广泛，影响力很大的一个库。除了 Node.js 里的应用，在其他软件里也应用广泛。

它提供了基于事件循环的异步 IO 支持。它支持 epoll、kqueue、Windows 的 IOCP 和 Solaris 的事件端口。最初它是 libev 和 IOCP 上的抽象。其中 libev 又是参考了 libevent（2002年第一个版本），能查到的最早的 1.85 版本 libev 是 2007 年。libev 最大的一个问题是不支持 Windows 的 IOCP，所以才会有后来的 libuv。

libuv 能下载到的最早版本是 2013 年的 v0.10.10。 Node.js 在 v0.9.0 版本（2012年）的 libuv 里去除了对 libev 的依赖，为了更好适配 Windows 平台。所以可以理解为 libuv 第一个重大意义的版本应该是在 2012 年。

梳理一下时间线：

libuv 使用 C 语言编写，使用起来也比较简单。

#include <stdio.h>
#include <stdlib.h>
#include <uv.h>

int main() {
  uv_loop_t* loop = (uv_loop_t*)malloc(sizeof(uv_loop_t));
  uv_loop_init(loop);

  printf("Now quitting.\n");
  uv_run(loop, UV_RUN_DEFAULT);

  uv_loop_close(loop);
  free(loop);
  return 0;
}

上面例子由于没有任何事件发生，所以 uv_run 里的事件循环马上就退出了。

当我们对特定的事件感兴趣时，比如：计时器、I/O设备、TCP/UDP、异步事件等等，我们需要创建相应的 handle，类型的名字使用 uv_TYPE_t 的形式：

typedef struct uv_handle_s uv_handle_t;
typedef struct uv_tcp_s uv_tcp_t;
typedef struct uv_udp_s uv_udp_t;
typedef struct uv_pipe_s uv_pipe_t;
typedef struct uv_timer_s uv_timer_t;
typedef struct uv_idle_s uv_idle_t;
typedef struct uv_async_s uv_async_t;
// ...

uv_TYPE_t 结构包含了事件所需的数据，同时各类 uv_TYPE_t 在内存布局上，拥有相同的起始内存布局（uv_handle_t）。以 uv_idle_t 举例：

struct uv_idle_s {
  UV_HANDLE_FIELDS
  UV_IDLE_PRIVATE_FIELDS
};

再来看一个使用 uv_idle_t 的示例：

#include <stdio.h>
#include <uv.h>

int64_t counter = 0;

void wait_for_a_while(uv_idle_t* handle) {
  counter++;

  if (counter >= 10e6)
    uv_idle_stop(handle);
}

int main() {
  uv_idle_t idler;

  uv_idle_init(uv_default_loop(), &idler);
  uv_idle_start(&idler, wait_for_a_while);

  printf("Idling...\n");
  uv_run(uv_default_loop(), UV_RUN_DEFAULT);

  uv_loop_close(uv_default_loop());
  return 0;
}

上面示例创建了一个 uv_idle_t 的实例，初始化（uv_idle_init）并启动了它（uv_idle_start）。每次 loop 循环里，都会触发 idle 的事件回调：wait_for_a_while。直到计数器 counter 大于等于 10000000 次，idle 事件停止，接着由于没有了任何有效的事件 handle，uv_run 内循环返回，退出。

这个例子是很好理解的，和其他 libuv 自带的例子一样。但是这些例子都没有告诉你一个事情：handle 什么时机释放？大多数例子里，handle 都是作为一个全局的静态变量存在的，或者像上面例子里的 idler 对象，都是在等待所有 uv 环境都退出，或者在进程退出前，才去释放的 handle 内存。

实际应用会比示例里的复杂，不可能将所有使用到的 handle 都放在全局对象里。当我们调用 uv_TYPE_stop 时，只是告诉 uv，这个 handle 不再是 ACTIVE 的状态。如果我们后续不再使用该 handle，应该调用 uv_close，告诉 uv 关闭这个 handle 吧，后面我们也不会再用它。

那么，我们是否在调用 uv_close 之后，就立即去释放 handle 的内存呢？这是一个容易踩坑的点。我们看 libuv 关于 uv_close 的文档：

void uv_close(uv_handle_t* handle, uv_close_cb close_cb)

Request handle to be closed. `close_cb` will be called asynchronously after
this call. This MUST be called on each handle before memory is released.
Moreover, the memory can only be released in `close_cb` or after it has
returned.

上面的说明有几个点：

close_cb 回调是异步被触发的。
每个 handle 都要调用 uv_close。
handle 的内存释放，必须在 close_cb 回调里，或者之后。

文档虽然写清楚了，但又有多少人认真阅读文档并执行呢？而我认为，这是 libuv 接口设计的缺陷，它将接口和内部的实现交错在了一起，使用者需要了解其内部的潜规则，才不会用错，才不会造成 Crash。

之所以调用 uv_close 后不能立即释放 handle 内存，是因为 uv_close 内部不是立即触发 close 和 close 回调。而是先将 handle 状态修改为 closing 和非 ACTIVE，然后加入 loop 的 closing 队列里。直到循环结束前触发 uv__run_closing_handles，执行真正的 close 和回调，在这之后，handle 的内存才能释放，才不会再被 libuv 使用到。

这是其中一个容易踩坑的点。另一个也是关于退出的流程。

libuv 的示例里，包含的几乎是正常的退出流程：各个 handle 都主动退出了，然后 uv_run 退出。实际遇到的情况是，handle 还处于有效的 ACTIVE 状态时，需要退出怎么办？调用 uv_loop_close ？不行，handle 还处于 ACTIVE 状态，uv_loop_close 会返回 UV_EBUSY。

正确的做法是调用 uv_stop，设置 loop 的状态为 stop，然后 uv_run 内的循环能退出出来。循环退出了，但是那些 ACTIVE 的 handle 怎么办？handle 何时 stop，内存如何释放？libuv 似乎没有明确告诉你应该怎么做。

uv_stop 前，应该通知各个模块把相关的 uv handle 都 close 掉。假设有漏网之鱼，怎么办？常见的做法是在 uv_stop 之后通过 uv_walk 遍历所有 handle，然后 close 掉。

uv_stop(loop);

auto const ensure_closing = [](uv_handle_t* handle, void*) {
  if (!uv_is_closing(handle)) {
    uv_close(handle, nullptr);
  }
};
uv_walk(loop, ensure_closing, nullptr);

但是，问题又来了，这样的兜底 uv_close 并没有传 close 回调，那 handle 的内存又如何删除？假如有漏网之鱼，就让它泄露算了？好像也没有太好的办法，只能尽量做好 handle 管理，能主动进行 uv_close，而不是等到 uv_walk 里被兜底 close。

接着又有另外一个问题，前面说到 uv_close 只是加入 loop 的 close 队列，并没有触发真正的 close 和回调。这时要确保这些 handle 被 close，则需要再次触发 uv_run 直到所有 handle close 完毕。

for (;;)
  if (uv_run(loop, UV_RUN_DEFAULT) == 0)
    break;

uv 已经要退出了，但为了执行前面的 uv__run_closing_handles，又触发了 uv_run 的执行。假设哪里处理不好，其他线程又往 uv 线程抛了新的 handle 任务，这时就是灾难，很可能 crash 就会扑面而来。所以这里一定要做好状态管理，一旦进入 uv_stop 流程，其他线程也不应该再往 uv 抛任务。

上面提到的问题不是只有新手才会犯，我们可以看看著名的 electron 项目，也是犯了同样的错误：

https://github.com/electron/electron/pull/25332

This PR wraps the uv_async_t objects owned by NodeBindings and ElectronBindings inside a new UvHandle wrapper class which handles uv_handle_ts’ specific rules about destruction:

[uv_close()] MUST be called on each handle before memory is released. Moreover, the memory can only be released in close_cb or after it has returned.

The UvHandle wrapper class handles this close-delete twostep so that client code doesn’t have to think about it. Failure to finish closing before freeing is what caused the uv_walk() crash in #25248.

为了确保 handle 正确被释放，electron 对 uv 的 handle 做了一层包装，很好的方法，可以借鉴一下：

template <typename T,
          typename std::enable_if<
              // these are the C-style 'subclasses' of uv_handle_t
              std::is_same<T, uv_async_t>::value ||
              std::is_same<T, uv_check_t>::value ||
              std::is_same<T, uv_fs_event_t>::value ||
              std::is_same<T, uv_fs_poll_t>::value ||
              std::is_same<T, uv_idle_t>::value ||
              std::is_same<T, uv_pipe_t>::value ||
              std::is_same<T, uv_poll_t>::value ||
              std::is_same<T, uv_prepare_t>::value ||
              std::is_same<T, uv_process_t>::value ||
              std::is_same<T, uv_signal_t>::value ||
              std::is_same<T, uv_stream_t>::value ||
              std::is_same<T, uv_tcp_t>::value ||
              std::is_same<T, uv_timer_t>::value ||
              std::is_same<T, uv_tty_t>::value ||
              std::is_same<T, uv_udp_t>::value>::type* = nullptr>
class UvHandle {
 public:
  UvHandle() : t_(new T) {}
  ~UvHandle() { reset(); }
  T* get() { return t_; }
  uv_handle_t* handle() { return reinterpret_cast<uv_handle_t*>(t_); }

  void reset() {
    auto* h = handle();
    if (h != nullptr) {
      DCHECK_EQ(0, uv_is_closing(h));
      uv_close(h, OnClosed);
      t_ = nullptr;
    }
  }

 private:
  static void OnClosed(uv_handle_t* handle) {
    delete reinterpret_cast<T*>(handle);
  }

  T* t_ = {};
};

这说明 libuv 还是容易踩坑的。

微信扫一扫交流

作者：CoderZh
微信关注：hacker-thinking （代码随想）
本文出处：https://blog.coderzh.com/2022/04/03/libuv/
文章版权归本人所有，欢迎转载，但未经作者同意必须保留此段声明，且在文章页面明显位置给出原文连接，否则保留追究法律责任的权利。

libuv 避坑指南

Search

Categories