(HotOS'03 Note) Crash-Only Software
Crash-Only Software
George Candea et al.
HotOS 2003
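The authors argue that all software will crash sooner or later and that crashes cannot be avoided, so software should be designed as crash-only. A program then either succeeds or crashes; there is no intermediate state in which it keeps running while "sick", so error recovery can be done simply by restarting. Candea et al. also published a paper on Microreboot at OSDI 2004 that develops this idea further.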
Crash-only systems are built from crash-only components, and the use of transparent component-level retries hides intra-system component crashes from end-users.
The paper gives some guidelines for building crash-only software:
Intra-component:
1. All important non-volatile state is managed by dedicated state stores.
Inter-component:
- Components have externally enforced boundaries;
- All interactions between components have a timeout;
- All resources are leased;
- Requests are entirely self-describing.
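The inter-component guidelines can be made concrete with a small sketch. The Python below is only an illustration and is not taken from the paper: the names `Request`, `Lease`, `FlakyComponent`, and `call_with_retry` are hypothetical, and the per-request deadline is a coarse stand-in for a real timeout, which would be enforced on the call itself. It shows a self-describing request, a lease that expires unless renewed, and a transparent retry that restarts a crashed component.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Self-describing request: carries what is needed to process or retry it."""
    operation: str
    payload: dict
    idempotent: bool   # assumption: marks whether re-issuing after a crash is safe
    deadline: float    # assumption: absolute monotonic time after which it is useless


class Lease:
    """A resource grant that expires on its own unless the holder renews it."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._clock = clock
        self._ttl = ttl_seconds
        self._expires_at = clock() + ttl_seconds

    def valid(self):
        return self._clock() < self._expires_at

    def renew(self):
        self._expires_at = self._clock() + self._ttl


class FlakyComponent:
    """Hypothetical component that fails until it has been restarted once."""

    def __init__(self):
        self.restarts = 0
        self._healthy = False

    def restart(self):
        # In a crash-only design, restart is the only recovery path.
        self.restarts += 1
        self._healthy = True

    def handle(self, request):
        if not self._healthy:
            raise RuntimeError("component crashed")
        return f"{request.operation} ok"


def call_with_retry(component, request, max_attempts=3, clock=time.monotonic):
    """Retry a request across component restarts, hiding the crash from the caller."""
    for _ in range(max_attempts):
        if clock() >= request.deadline:
            raise TimeoutError("request deadline passed")
        try:
            return component.handle(request)
        except RuntimeError:
            if not request.idempotent:
                raise  # re-issuing could apply the operation twice
            component.restart()
    raise RuntimeError("gave up after retries")
```

A caller would build a request such as `Request("read", {"key": "k"}, idempotent=True, deadline=time.monotonic() + 5.0)` and pass it to `call_with_retry`; the crash and restart of the component stay invisible to that caller. Because the lease expires by itself, a holder that crashes never has to release the resource explicitly.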
