A Brief Look at JavaScript Error Handling and Stack Traces

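It is easy to overlook the finer details of error handling and stack traces, yet those details are very useful when you write libraries for testing or error handling. This week, for example, Chai received a great PR that greatly improves the way we handle stack traces, so users get more information (to help them locate the problem) when an assertion fails.

Handling stack traces well lets you strip out useless data and focus on what matters. A better understanding of Error objects and their properties also helps you make fuller use of them.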

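How the (function) call stack works

Before talking about errors, we need to understand how the (function) call stack works:

When a function is called, it is pushed onto the top of the stack. When it finishes running, it is removed from the top of the stack.

A stack is a last-in, first-out data structure, commonly abbreviated as LIFO (last in, first out).

For example: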

function c() {
    console.log('c');
}

function b() {
    console.log('b');
    c();
}

function a() {
    console.log('a');
    b();
}

a();

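In the example above, when function a runs, it is added to the top of the stack. Then, when function b is called inside a, b is pushed onto the top of the stack. When function c is called inside b, it is pushed onto the top of the stack as well.

While function c is running, the stack contains a, b and c (in that order).

When c finishes, it is removed from the top of the stack and control returns to b. When b finishes, it too is removed from the top of the stack and control returns to a. Finally, when a finishes, it is also removed from the stack.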
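The push and pop order described above can be sketched with a plain array standing in for the call stack. This is only an illustration: the stack array and the enter/leave helpers are hypothetical names, not something the JavaScript engine exposes.

```javascript
// A plain array standing in for the engine's call stack (illustration only).
const stack = [];
const snapshots = [];

function enter(name) {
    stack.push(name);                      // calling a function pushes a frame
    snapshots.push(stack.join(' > '));
}

function leave() {
    stack.pop();                           // returning removes the top frame
    snapshots.push(stack.join(' > ') || '(empty)');
}

function c() { enter('c'); leave(); }
function b() { enter('b'); c(); leave(); }
function a() { enter('a'); b(); leave(); }

a();
console.log(snapshots.join('\n'));
```

Each printed line is the state of the simulated stack after one push or pop, so the frames appear and disappear in last-in, first-out order.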

To demonstrate this behavior, you can use console.trace() to print the current stack to the console. Read the printed stack from top to bottom: the most recent call comes first.

function c() {
    console.log('c');
    console.trace();
}

function b() {
    console.log('b');
    c();
}

function a() {
    console.log('a');
    b();
}

a();

Running the code above in the Node REPL produces the following output:

Trace
    at c (repl:3:9)
    at b (repl:3:1)
    at a (repl:3:1)
    at repl:1:1 // <-- For now feel free to ignore anything below this point, these are Node's internals
    at realRunInThisContextScript (vm.js:22:35)
    at sigintHandlersWrap (vm.js:98:12)
    at ContextifyScript.Script.runInThisContext (vm.js:24:12)
    at REPLServer.defaultEval (repl.js:313:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)

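As you can see, when the trace is printed from inside function c, the stack contains a, b and c.

If we instead print the stack from function b after c has finished running, c has already been removed from the top of the stack, so the stack contains only a and b.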

function c() {
    console.log('c');
}

function b() {
    console.log('b');
    c();
    console.trace();
}

function a() {
    console.log('a');
    b();
}

a();

As the output below shows, c has finished running and has already been removed from the top of the stack.

Trace
    at b (repl:4:9)
    at a (repl:3:1)
    at repl:1:1  // <-- For now feel free to ignore anything below this point, these are Node's internals
    at realRunInThisContextScript (vm.js:22:35)
    at sigintHandlersWrap (vm.js:98:12)
    at ContextifyScript.Script.runInThisContext (vm.js:24:12)
    at REPLServer.defaultEval (repl.js:313:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:513:10)

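The Error object and error handling

When something goes wrong at runtime, an Error object is usually thrown. The Error object can also serve as the prototype that user-defined error objects inherit from.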
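As a minimal sketch of a user-defined error that inherits from Error, consider the class below. ValidationError and its field property are hypothetical names chosen for this example only.

```javascript
// ValidationError is a hypothetical name used only for this example.
class ValidationError extends Error {
    constructor(message, field) {
        super(message);
        this.name = 'ValidationError'; // without this, name would stay 'Error'
        this.field = field;            // extra, application-specific data
    }
}

let caught = null;
try {
    throw new ValidationError('Age must be a number.', 'age');
} catch (err) {
    caught = err;
}

console.log(caught instanceof ValidationError); // true
console.log(caught instanceof Error);           // true
console.log(caught.name + ': ' + caught.message);
```

Because the custom error is still an instance of Error, code that only checks for Error keeps working, while callers that care can test for the more specific type.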

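The Error.prototype object has the following properties:

constructor - the function that created the instance's prototype
message - the error message
name - the name (type) of the error

These are the standard properties of Error.prototype. In addition, each environment has its own specific properties. In environments such as Node, Firefox, Chrome, Edge, IE 10+, Opera and Safari 6+, Error objects also have a stack property, which holds the error's stack trace. An error instance's stack trace contains all the stack frames up to the call of its own constructor.

To learn more about the environment-specific properties of Error objects, see the MDN article on the subject.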
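The properties listed above can be inspected directly on any error instance. The sketch below uses the built-in RangeError and guards the read of stack, since that property is not part of the standard and its exact format differs between environments.

```javascript
const err = new RangeError('Value out of range.');

console.log(err.name);                       // RangeError
console.log(err.message);                    // Value out of range.
console.log(err.constructor === RangeError); // true

// `stack` is environment-specific, so check for it before relying on it.
if (typeof err.stack === 'string') {
    console.log(err.stack);
}
```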

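To throw an error, you must use the throw keyword. To catch a thrown error, you must wrap the code that may throw it in try...catch. The parameter of catch is the error instance that was thrown.

As in Java, JavaScript allows a finally clause after try/catch. The finally block runs after the error has been handled, and also when no error was thrown, which makes it a good place for cleanup work.

Syntactically, a try block does not have to be followed by a catch block, but in that case it must be followed by a finally block. This means there are three forms of the try statement: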

try...catch
try...finally
try...catch...finally
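To make the role of finally concrete, here is a small sketch of the try...catch...finally form. The run function and the log array are hypothetical names used only to record the order in which the blocks execute.

```javascript
const log = [];

function run(shouldThrow) {
    try {
        log.push('try');
        if (shouldThrow) {
            throw new Error('Something failed.');
        }
        return 'ok';
    } catch (err) {
        log.push('catch: ' + err.message);
        return 'recovered';
    } finally {
        // Runs on both paths, even though try or catch has already returned.
        log.push('finally');
    }
}

console.log(run(false)); // ok
console.log(run(true));  // recovered
console.log(log);
```

The finally block is recorded in both runs, which is why it suits cleanup such as closing handles or resetting state.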

A try statement can also be nested inside another try statement:

try {
    try {
        throw new Error('Nested error.'); // The error thrown here will be caught by its own `catch` clause
    } catch (nestedErr) {
        console.log('Nested catch'); // This runs
    }
} catch (err) {
    console.log('This will not run.');
}

A try statement can also be nested inside a catch or finally block:

try {
    throw new Error('First error');
} catch (err) {
    console.log('First catch running');
    try {
        throw new Error('Second error');
    } catch (nestedErr) {
        console.log('Second catch running.');
    }
}

try {
    console.log('The try block is running...');
} finally {
    try {
        throw new Error('Error inside finally.');
    } catch (err) {
        console.log('Caught an error inside the finally block.');
    }
}

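It is important to note that you can throw a plain value instead of an Error object. Although this may look cool and is allowed, it is not recommended, especially for developers of libraries and frameworks that have to deal with other people's code: there is no standard to rely on, and you cannot know what you will get from users. You cannot trust users to throw Error objects, because they may simply throw a string or a number instead. This also makes it harder to work with stack traces and other metadata.

For example: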

function runWithoutThrowing(func) {
    try {
        func();
    } catch (e) {
        console.log('There was an error, but I will not throw it.');
        console.log('The error\'s message was: ' + e.message)
    }
}

function funcThatThrowsError() {
    throw new TypeError('I am a TypeError.');
}

runWithoutThrowing(funcThatThrowsError);

If the function passed to runWithoutThrowing throws an Error object, the code above catches it and works as expected. However, if it throws a string, we run into problems:

function runWithoutThrowing(func) {
    try {
        func();
    } catch (e) {
        console.log('There was an error, but I will not throw it.');
        console.log('The error\'s message was: ' + e.message)
    }
}

function funcThatThrowsString() {
    throw 'I am a String.';
}

runWithoutThrowing(funcThatThrowsString);

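Now the second console.log prints the message as undefined, because a string has no message property. This may not seem important, but if you need to make sure the Error object has a particular property, or to handle Error-specific properties in some other way (as Chai's throws assertion does), you have to do a lot more work to make sure everything behaves correctly.

Also, if the thrown value is not an Error object, there is no stack property to read.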
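One possible way to cope with arbitrary thrown values is to normalize them into Error objects before reading message or stack. The sketch below is an assumption about how a library might do this, not Chai's actual approach; toError and runAndCapture are hypothetical helpers.

```javascript
// toError and runAndCapture are hypothetical helpers, not part of any library.
function toError(thrown) {
    if (thrown instanceof Error) {
        return thrown;
    }
    // The stack of this new Error points here, not at the original throw site.
    return new Error(String(thrown));
}

function runAndCapture(func) {
    try {
        func();
        return null;
    } catch (e) {
        return toError(e);
    }
}

const fromString = runAndCapture(function () { throw 'I am a String.'; });
const fromError = runAndCapture(function () { throw new TypeError('I am a TypeError.'); });

console.log(fromString.message); // I am a String.
console.log(fromError.name);     // TypeError
```

Normalizing gives later code a uniform shape to work with, but the original throw location of a non-Error value is still lost, which is one more reason to throw Error objects in the first place.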

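Errors can also be used like any other object. You do not have to throw them, which is why many callback functions take an Error as their first argument. For example: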

const fs = require('fs');

fs.readdir('/example/i-do-not-exist', function callback(err, dirs) {
    if (err instanceof Error) {
        // `readdir` passes an error to the callback because that directory does not exist
        // We will now be able to use the error object passed by it in our callback function
        console.log('Error Message: ' + err.message);
        console.log('See? We can use Errors without using try statements.');
    } else {
        console.log(dirs);
    }
});
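The same error-first convention can be followed in your own functions. The sketch below wraps JSON.parse in a hypothetical parseJson helper; the helper name and its callback signature are assumptions made for illustration.

```javascript
// parseJson is a hypothetical helper that follows the error-first callback style.
function parseJson(text, callback) {
    let value;
    try {
        value = JSON.parse(text);
    } catch (err) {
        callback(err);       // failure: the error goes in the first slot
        return;
    }
    callback(null, value);   // success: null in the error slot, then the result
}

const results = [];

parseJson('{"ok": true}', function (err, value) {
    results.push(err ? 'error' : 'value: ' + value.ok);
});

parseJson('{not json', function (err, value) {
    results.push(err instanceof SyntaxError ? 'SyntaxError' : 'unexpected');
});

console.log(results);
```

The try block covers only JSON.parse, so an exception thrown by the callback itself is not caught and the callback is never invoked twice.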

Finally, Error objects can also be used to reject promises, which makes rejected promises easy to handle:

new Promise(function(resolve, reject) {
    reject(new Error('The promise was rejected.'));
}).then(function() {
    console.log('I am an error.');
}).catch(function(err) {
    if (err instanceof Error) {
        console.log('The promise was rejected with an error.');
        console.log('Error Message: ' + err.message);
    }
});
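In environments that support async functions, a rejection can also be handled with an ordinary try...catch around await. This is a sketch under that assumption; failLater and main are hypothetical names.

```javascript
function failLater() {
    return new Promise(function (resolve, reject) {
        reject(new Error('The promise was rejected.'));
    });
}

async function main() {
    try {
        await failLater();
        return 'resolved';
    } catch (err) {
        // A rejection surfaces here as a thrown value.
        return 'caught: ' + err.message;
    }
}

const outcome = main();
outcome.then(function (text) {
    console.log(text);
});
```

Rejecting with an Error object means the catch block can rely on message and stack, exactly as with a synchronous throw.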

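Manipulating stack traces

This section applies to environments that support Error.captureStackTrace, such as Node.js.

Error.captureStackTrace takes an object as its first argument and, optionally, a function as its second. It captures the current stack trace and creates a stack property on the first argument to store it. If the second argument is provided, that function is treated as the end point of the call stack: the captured trace omits that function's frame and every frame above it, so it shows only what happened before the function was called.

The following two demos illustrate this. The first simply stores the captured stack trace in a plain object: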

const myObj = {};

function c() {
}

function b() {
    // Here we will store the current stack trace into myObj
    Error.captureStackTrace(myObj);
    c();
}

function a() {
    b();
}

// First we will call these functions
a();

// Now let's see what is the stack trace stored into myObj.stack
console.log(myObj.stack);

// This will print the following stack to the console:
//    at b (repl:3:7) <-- Since it was called inside B, the B call is the last entry in the stack
//    at a (repl:2:1)
//    at repl:1:1 <-- Node internals below this line
//    at realRunInThisContextScript (vm.js:22:35)
//    at sigintHandlersWrap (vm.js:98:12)
//    at ContextifyScript.Script.runInThisContext (vm.js:24:12)
//    at REPLServer.defaultEval (repl.js:313:29)
//    at bound (domain.js:280:14)
//    at REPLServer.runBound [as eval] (domain.js:293:12)
//    at REPLServer.onLine (repl.js:513:10)

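As the example shows, we first call a (which is pushed onto the stack), and a then calls b (which is pushed on top of a). Inside b we capture the current stack trace and store it in myObj. That is why the trace printed to the console contains only the calls to b and a.

Now let's pass a function as the second argument to Error.captureStackTrace and look at the output: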

const myObj = {};

function d() {
    // Here we will store the current stack trace into myObj
    // This time we will hide all the frames after `b` and `b` itself
    Error.captureStackTrace(myObj, b);
}

function c() {
    d();
}

function b() {
    c();
}

function a() {
    b();
}

// First we will call these functions
a();

// Now let's see what is the stack trace stored into myObj.stack
console.log(myObj.stack);

// This will print the following stack to the console:
//    at a (repl:2:1) <-- As you can see here we only get frames before `b` was called
//    at repl:1:1 <-- Node internals below this line
//    at realRunInThisContextScript (vm.js:22:35)
//    at sigintHandlersWrap (vm.js:98:12)
//    at ContextifyScript.Script.runInThisContext (vm.js:24:12)
//    at REPLServer.defaultEval (repl.js:313:29)
//    at bound (domain.js:280:14)
//    at REPLServer.runBound [as eval] (domain.js:293:12)
//    at REPLServer.onLine (repl.js:513:10)
//    at emitOne (events.js:101:20)

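When b is passed as the second argument to Error.captureStackTrace, the captured stack contains only what happened before b was called (even though Error.captureStackTrace itself is called inside d), which is why only a shows up in the console output. The benefit of this approach is that it hides internal implementation details that are irrelevant to the user.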
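As a rough sketch of how a testing library could use this to hide its internals, consider the assertion helper below. assertEqual and userTest are hypothetical names, and this is not Chai's implementation. The call is guarded because Error.captureStackTrace is not available in every environment.

```javascript
// assertEqual is a hypothetical assertion helper, not Chai's implementation.
function assertEqual(actual, expected) {
    if (actual === expected) {
        return;
    }
    const err = new Error('Expected ' + expected + ' but got ' + actual);
    // Error.captureStackTrace is not available everywhere, so guard the call.
    if (typeof Error.captureStackTrace === 'function') {
        // Drop assertEqual and every frame above it from the trace.
        Error.captureStackTrace(err, assertEqual);
    }
    throw err;
}

function userTest() {
    assertEqual(1 + 1, 3);
}

let stack = '';
try {
    userTest();
} catch (err) {
    stack = err.stack;
}

console.log(stack);
```

Where Error.captureStackTrace is supported, the printed trace starts at userTest rather than at the helper, so the user sees the line in their own code that triggered the failure.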

Reference

JavaScript Errors and Stack Traces in Depth
