Time-Travel diagnostics in Node #164

Closed
mrkmarron opened this issue Feb 15, 2018 · 7 comments

Comments

@mrkmarron

At the Node Diagnostics WG there was discussion around what is needed to make time-travel debugging more generally available. Key work items identified to make this happen include:

  1. Work to refactor certain coding idioms into more record/replay-friendly forms. One example is the AliasedBuffer class.
  2. Discussion around user experience and surfacing of this functionality. We have experimented with several aspects:
    • Adding a new domain, TimeTravel, to the debugger API.
    • Using this to add time-travel support to the built-in inspect debugger as well as VSCode.
    • Adding a new module, trace_mgr.js, to enable simple generation of traces at common error points (unhandled exceptions, abnormal exits, console.error writes, etc.).
    • Getting additional feedback on the value and impact of these use cases would be very useful in prioritizing and adjusting the implementation.
  3. Structuring code to make the implementation of the needed record/replay/snapshot code as VM neutral and easy as possible. More below.

Providing TimeTravel as a functionality in Node uniformly instead of as a vendor specific feature is a challenge.

One option is for each vendor to implement it primarily in the VM, as ChakraCore does now. This approach requires minimal changes in Node, but it is likely to be problematic: it requires extensive duplication of almost identical functionality in each VM providing the support and, in the case of V8, since Node currently calls directly into its raw APIs, the addition of record/replay support raises potentially large maintainability and performance concerns.

The second approach, as discussed during the WG, involves using N-API to move away from direct calls to V8 APIs. This has the immediate benefit of providing a single location for the large body of common record/replay code currently implemented in Chakra's JsRT host embedding API (for example here, which drives the logger in ChakraCore), as a neutral shared implementation based around the N-API specification. As N-API is at a higher level than JsRT, this will also decrease the overhead of running in record mode. However, several items need investigation/work for this approach:

  1. The record/replay code must track identity for opaque references passed to/from various calls. ChakraCore has a non-moving collector, so we can simply use pointers as ids. Other VMs may use moving collectors, which require more sophisticated approaches, e.g., adding an explicit tag field to a napi_value or updating tag info whenever the GC moves it.
  2. An N-API-style model needs to be adopted in core so that there are no longer any direct calls to V8 APIs.
  3. A smaller API needs to be defined which the VM can implement to support any features which cannot be included in the general Node layer, e.g., record Data.Time calls in the VM, take/restore a snapshot, etc.
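
Items 1 and 3 can be sketched roughly as follows. This is a minimal TypeScript illustration, not the ChakraCore or N-API implementation, and every name in it (`TagTable`, `VmHooks`, `TimeLog`) is hypothetical. It assumes that identity is kept in a side table keyed by the object rather than derived from its address, so a tag would stay valid if a collector moved the object, and that the VM exposes only a small hook for the nondeterministic call while the shared layer owns the log.

```typescript
// Hypothetical sketch: none of these names exist in Node, N-API, or ChakraCore.

// Item 1: stable identity for opaque references, independent of addresses.
class TagTable {
  private next = 1;
  private tags = new WeakMap<object, number>();

  // Return the tag for `ref`, assigning a fresh one the first time it is seen.
  tagOf(ref: object): number {
    let tag = this.tags.get(ref);
    if (tag === undefined) {
      tag = this.next++;
      this.tags.set(ref, tag);
    }
    return tag;
  }
}

// Item 3: the small surface a VM would implement itself.
interface VmHooks {
  now(): number; // nondeterministic time source
}

type Mode = "record" | "replay";

// Shared, VM-neutral layer: logs results in record mode, replays them later.
class TimeLog {
  readonly log: number[];
  private mode: Mode;
  private hooks: VmHooks;
  private cursor = 0;

  constructor(mode: Mode, hooks: VmHooks, log: number[] = []) {
    this.mode = mode;
    this.hooks = hooks;
    this.log = log;
  }

  now(): number {
    if (this.mode === "record") {
      const value = this.hooks.now();
      this.log.push(value); // remember what the VM returned
      return value;
    }
    // Replay: ignore the VM and return the next logged value.
    const value = this.log[this.cursor];
    if (value === undefined) {
      throw new Error("replay diverged: log exhausted");
    }
    this.cursor++;
    return value;
  }
}
```

In this sketch a replay run constructed with the log from a record run returns the same sequence of values regardless of what the VM hook would return, and it throws if the program asks for more values than were recorded.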
@mcollina
Member

Good writeup. I fear the N-API approach is not feasible: adopting it would have a noticeable performance overhead, and I'm not confident that the overall group would accept the tradeoff.

@mhdawson
Member

Is there a way to limit the scope of

"An N-API-style model needs to be adopted in core so that there are no longer any direct calls to V8 APIs."

to something smaller than "everything"? I agree that we are a long way from removing all direct references to V8, but if a smaller subset is possible (even if that only gives us a subset of the history), then there might be a way to make incremental progress. Otherwise we are probably stuck with the option of it being implemented in the JavaScript engines for at least a number of years.

@mrkmarron
Author

One other option is to move incrementally towards a full N-API implementation, using conditional compilation to provide a "diagnostic" build with some overhead and a zero-cost production build while code is being converted to N-API and the performance issues that arise are worked out.
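
The diagnostic/production split might look roughly like this. The sketch is in TypeScript and only stands in for what would be conditional compilation in Node's C++ sources; `makeInstrument`, `instrument`, `diagnosticBuild`, and `callLog` are hypothetical names, and the boolean parameter plays the role of a build-time flag. With the flag off, the original function is returned unchanged, so the production path carries no wrapper.

```typescript
// Hypothetical sketch: `diagnosticBuild` stands in for a build-time flag
// (in C++ this would be a preprocessor definition).
function makeInstrument(diagnosticBuild: boolean, callLog: string[]) {
  return function instrument<A extends unknown[], R>(
    name: string,
    fn: (...args: A) => R,
  ): (...args: A) => R {
    if (!diagnosticBuild) {
      return fn; // production: no wrapper, so no added cost per call
    }
    return (...args: A): R => {
      callLog.push(name); // diagnostic: note the call before forwarding it
      return fn(...args);
    };
  };
}
```

The point of the design is that call sites are written once against `instrument`, and only the flag decides whether the recording layer exists at all.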

A limited-scope TTD implementation could be possible, but I think it could be difficult for people to understand when and why TTD doesn't work in some cases.

@github-actions

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

@mmarchini
Contributor

Was going to let this be closed, but @bnb mentioned something about it on Twitter so maybe there's hope this will be a Node.js/V8 feature some day ^^

@bnb

bnb commented Jul 21, 2020

Extending that, if folks have front-end use cases (obviously not the right forum here but want to be clear on my... path to success) and would like to really see this land in Chromium (leading to it being available in Edge, Chrome, Electron, and other Chromium-based projects) please reach out to me.

I'd like to chat with you about yourself, why you want it, where you want it, and how it could be valuable to you.

@github-actions

github-actions bot commented Aug 4, 2022

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.
