Closed Bug 65080 Opened 25 years ago Closed 16 years ago

XPCOM needs rich error reporting

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: markh, Assigned: markh)

Details

This RFE is to track a potential new XPCOM feature - allowing XPCOM components to provide rich error information in the case of failure.
The basic problem is that the XPCOM nsresult codes often do not convey enough information. For example, consider an XPCOM component that fails due to some underlying IO operation failing. If there were a technique for returning the underlying operating system's error message to the caller, diagnostics would become that much simpler.
We have this specific requirement in Komodo - some components are responsible for file IO. If this file IO fails, we would like detailed information to present to the user (ie, is the file read-only, is it a permissions problem, or simply "file not found"?). We currently hack this by adding extra "out string status" params to many, many methods.
Another example is the Javascript bindings - "unexpected" errors in a js component will generally yield an NS_ERROR_FAILURE, with specific details logged to the console. However, this means there is no reasonable way for the caller to get these specific details (eg, the filename etc) for special handling of the error. Similarly, if a Python component fails, there is no way for the js caller to see the specific error details (eg, the specific Python exception and traceback).
I propose that a scheme similar to MSCOM be used:
* 2 interfaces are designed and used - nsISupportsErrorInfo and nsIErrorInfo.
  - nsISupportsErrorInfo is used to determine if a component supports rich error info and if any such error information is available at this instant.
  - nsIErrorInfo is used to extract the specific error information, if available.
* When a component fails, the caller can QI the failing interface for nsISupportsErrorInfo. If this succeeds, the component supports extended errors. Once obtained, nsISupportsErrorInfo is used to determine if extended info is available for the most recent failure, yielding an nsIErrorInfo.
If we can get agreement that such a facility is a good idea I am happy to start thinking about the interface design.
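To make the proposed flow concrete, here is a minimal Python sketch of the caller-side logic. It is only an illustration: the classes `SupportsErrorInfo`, `ErrorInfo` and `FileReader` are hypothetical stand-ins for the proposed nsISupportsErrorInfo and nsIErrorInfo interfaces and for a Komodo-style file component, and `isinstance` plays the role of QueryInterface. No such interfaces exist in XPCOM at this point.

```python
NS_OK = 0
NS_ERROR_FAILURE = 0x80004005


class ErrorInfo:
    """Hypothetical stand-in for the proposed nsIErrorInfo."""

    def __init__(self, result, message):
        self.result = result
        self.message = message


class SupportsErrorInfo:
    """Hypothetical stand-in for the proposed nsISupportsErrorInfo."""

    def get_error_info(self):
        """Return the ErrorInfo for the most recent failure, or None."""
        raise NotImplementedError


class FileReader(SupportsErrorInfo):
    """Illustrative component that keeps the OS error text on failure."""

    def __init__(self):
        self._last_error = None

    def read(self, path):
        self._last_error = None
        try:
            with open(path, "rb") as handle:
                handle.read()
            return NS_OK
        except OSError as exc:
            # Keep the operating system's message instead of discarding it.
            self._last_error = ErrorInfo(NS_ERROR_FAILURE,
                                         exc.strerror or str(exc))
            return NS_ERROR_FAILURE

    def get_error_info(self):
        return self._last_error


def describe_failure(component, rv):
    """Caller side: 'QI' for rich error support, fall back to the bare code."""
    if isinstance(component, SupportsErrorInfo):
        info = component.get_error_info()
        if info is not None:
            return "0x%08X: %s" % (info.result, info.message)
    return "0x%08X" % rv
```

A caller that receives a failing result would pass the component and the result to `describe_failure`; components that do not opt in simply yield the bare code, as they do today.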
Status: UNCONFIRMED → NEW
Ever confirmed: true
Would it make more sense (in this context) for Komodo to take the Console Service and replace it with windowed output? This isn't saying it's a bad idea, but there's a lot of work involved in reflecting that discursive information into a central service.
I agree that something like this is needed. Unlike MSCOM, though, I would be happy with not registering the IErrorInfo interface in a global place; instead, as Mark states, have it be retrievable directly from the erroring object. I think just by defining these interfaces Mark (Komodo) and myself (and the project I am working on) would get what we need, which is richer error information without any additional service. (Mark's method would not involve reflecting the information into a central service, at least as far as I understand it.)
True. There would be many error interfaces, but the work at the component level would be the same whether it was to populate a central resource or a local component one. Mark has identified file i/o as one important interface; should that be the first one? Is IErrorInfo sufficient as a model? It sounds like you want more than that gives.
One strategy is to take what xpconnect does and generalize it (then convert xpconnect to use the more general system). See xpcexception.idl for its exception interface. There is also a more detailed, structured information object that it adds as the 'data' attribute. Look at nsIScriptError too.
Since you mention the inability to get extended error information from JS, I'll point out the system that is in place that would likely give you more info than you know about. xpconnect *does* store per-thread exception objects that map to nsresult errors. If you call nsIXPConnect::GetPendingException (i.e. the 'PendingException' readonly attribute on that interface) you get the nsIXPCException object for the last error on the current thread (or null if the last operation succeeded). You could dig in a bit and see if the way this works is applicable to what you'd like to do elsewhere.
I have no problem with such a global service getting built, but I expect that a limited set of callees would actually use it. One issue is that the cost of telling the service to *clear* the last error on every success would be staggering - so this would not happen everywhere. Perhaps only callers who know they might need to retrieve an error exception (if the call they are about to make should fail) would clear the error, so that they would know that if an error happens and they subsequently call the service then any pending info is known to be fresh and not left over from who-knows-when.
The other big space overhead issue is the bloat of adding code to set error info. We talked once about a set of "RETURN_ERROR(code,...)" macros that we could put all over the code that would hide the details of doing some call to get the error service and (potentially) format the info and set the pending error. One way or another this would be more bloat than "return NS_ERROR_FOO;"
I suppose, really, that there is a limited set of classes that are going to really set and get these exceptions. Rather than incur the bloat everywhere, we'd just use this functionality where we need it. You *could* just use out params for extended error info, but there may well be other layers between the caller who cares and the callee who sets the info. I agree this service would be a fine thing.
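The stale-error concern and the "RETURN_ERROR" idea above can be sketched as follows. This is a Python illustration only, not xpconnect's implementation: `return_error`, `clear_pending_error`, `get_pending_error` and the two callees are hypothetical names, and a module-level `threading.local` stands in for the per-thread pending exception.

```python
import threading

NS_OK = 0
NS_ERROR_FAILURE = 0x80004005

# Hypothetical per-thread slot holding the last explicitly set error.
_pending = threading.local()


def clear_pending_error():
    _pending.error = None


def get_pending_error():
    return getattr(_pending, "error", None)


def return_error(code, message):
    """Stand-in for a RETURN_ERROR macro: record details, hand back the code."""
    _pending.error = (code, message)
    return code


def careful_callee(fail):
    """A callee that opts in to rich errors."""
    if fail:
        return return_error(NS_ERROR_FAILURE, "disk is read-only")
    # The success path never touches the pending error, so it costs nothing.
    return NS_OK


def plain_callee():
    """The cheap 'return NS_ERROR_FOO;' path: a failure with no details."""
    return NS_ERROR_FAILURE


def call_with_details(func, *args):
    """A caller that cares clears first, so anything it finds is fresh."""
    clear_pending_error()
    rv = func(*args)
    details = get_pending_error() if rv != NS_OK else None
    return rv, details
```

Calling `careful_callee(True)` and then `plain_callee()` directly leaves the first failure's details pending, which is exactly the stale case described above; going through `call_with_details` avoids it at the cost of one clear per interested call.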
Getting the error info *later* directly from the object that gives the error is messy. Often the interested caller doesn't even have direct access to the object that really caused the error. It also requires another interface (and vtbl ptr) on each instance of such an object. Is it not better to have the erroring object gather and store the interesting error info in a central location? Or, at least, store a reference to an object that can be asked to produce that info - this would leave a strong ref back to the object until the error is cleared, but this is not so bad.
This is actually the method used by Microsoft: you query the interface whose call failed to see if it supports enhanced error info, then you call GetErrorInfo, which gets the IErrorInfo pointer that was registered by the erroring object. This limits the "bloating" of the individual objects. Plus, the object which errors can generate an entirely new object which implements IErrorInfo and store this globally (as is the case in MSCOM), thus eliminating the circular reference. The nice part of this whole thing is that you only have to "bloat" the interfaces where returning rich error information would be useful. If it is not useful, then those implementations can just stick to the normal nsresult returns, and not implement ISupportsErrorInfo.
Summary: [RFE] XPCOM needs rich error reporting → XPCOM needs rich error reporting
I have exchanged some email with jband, and it appears the following would keep both me and the NS/Moz guys happy. It is quite different to the MS model, but I think this design is reasonable.
We are making the following assumptions:
* Almost no one will want to code explicit support for extended error information. Although we may provide helper functions etc for C++ implemented components, it will only be used rarely from this language.
* The biggest users of this facility will be the language bindings - currently limited to Javascript and Python, but soon to include Perl ;-)
Thus, we don't want to add overhead to _every_ XPCOM object, and don't even want to add overhead to all objects that may potentially provide such information.
The basic plan is:
* Implement a new service that is used to save and fetch error information on a per-thread basis.
* Anyone wishing to fetch error information can do so at any time - they can simply query the service for the current error on the current thread. To ensure that an error is not stale, anyone who can use extended error information must explicitly clear the error state before they make an XPCOM call.
* Anyone wishing to set an error simply registers the error information with the service. It remains "current" until explicitly cleared, fetched, or the thread dies (assuming TLS allows that!)
[We are now getting beyond jband and my mail conversation!]
I propose that the error service simply store a strong reference to nsIScriptError interfaces, rather than the error fields directly. One key advantage is that in some cases it may be useful to QI for a specific interface known by the error handler. <blue-sky-example>For example, it may make sense for Python to have an "nsIPythonError", so it can "combine" errors even across XPCOM boundaries</blue-sky-example>, and later <even-bluer>we may even allow nsIScriptDebuggerSomething that could be used for JIT style debugging</even-bluer>. The consumer of the error objects would just QI for any additional interfaces once it has the base error interface.
The hope is that users of "scripting languages" will have a trivial way of setting and getting the error. This will certainly be true for Python. Eg:
    raise ServerException(NS_ERROR_WHATEVER, "here is some extra text")
is all that would be needed to raise one of these, and:
    except COMException, e:
        print e.message
would be used to extract them (the "spelling" may change, tho). Any "unexpected" Python errors would have a standard Python exception string - thus callers would have the Python error message available to them with no extra work. I assume js could do something similar, so the JS error text would then be available from other languages that invoked a failed call. I don't know enough about js syntax to suggest exactly how this would work there, but assume it is very doable.
Does this sound reasonable? If we can bash the above into something that has general agreement, I will knock up some interfaces for comment, then move towards specific patches.
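The plan above could look roughly like the following Python sketch. Everything named here is an assumption made for illustration: `ErrorService` is not an existing XPCOM service, `ScriptError` only stands in for nsIScriptError, `PythonError` stands in for the blue-sky "nsIPythonError", and `isinstance` replaces the QI for an additional interface.

```python
import threading


class ScriptError:
    """Stand-in for the base error interface (nsIScriptError in the plan)."""

    def __init__(self, result, message, filename="", line=0):
        self.result = result
        self.message = message
        self.filename = filename
        self.line = line


class PythonError(ScriptError):
    """Stand-in for the blue-sky 'nsIPythonError': adds a traceback."""

    def __init__(self, result, message, traceback_text):
        super().__init__(result, message)
        self.traceback_text = traceback_text


class ErrorService:
    """Hypothetical service holding one error object per thread."""

    def __init__(self):
        self._tls = threading.local()

    def set_error(self, error):
        # Holds a reference to the error object, not a copy of its fields.
        self._tls.current = error

    def clear_error(self):
        self._tls.current = None

    def take_error(self):
        """Fetch the current thread's error; fetching also clears it."""
        error = getattr(self._tls, "current", None)
        self._tls.current = None
        return error


def report(error):
    """Consumer: use the base fields, then check for a richer interface."""
    text = "0x%08X: %s" % (error.result, error.message)
    if isinstance(error, PythonError):
        text += "\n" + error.traceback_text
    return text
```

Because the service stores the object itself, a consumer that knows about `PythonError` gets the traceback, while one that only knows the base class still gets the result code and message.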
Just been discussing this with jst/jag/Sliver on #mozilla, specifically how it relates to what jst needs for his DOM work.
My understanding: In a nutshell, jst needs to be able to provide an error message for a specific nsresult coming back from the DOM. However, he does not want to burden every single COM entry point with this "extended error" logic, and it doesn't make sense to invent a completely new extended error technique just for the DOM.
The idea we came up with is for a "module error provider object". This is a component that knows how to map nsresults for a specific module back to extended error information. This would then make the logic for getting extended error information similar to this:
* See if an explicit error has been registered with the service.
* If not, attempt to create a component/use a service with a contract ID similar to "...;errormodule=14" (assuming we are looking at a DOM error, as NS_ERROR_MODULE_DOM==14), pass it the nsresult and get an nsIScriptError out of it.
* Otherwise no extended error information is available for this call.
This technique should allow almost zero overhead when C++ is calling the DOM (as the C++ code will never ask for the extended error and therefore the DOM will never create any such object), and prevent jst from touching every method. It also guarantees that any scripting language will see the exact same DOM error information, as long as they use the new error service.
Random notes on this:
* This new "module error provider" must be capable of being stateless - ie, it is only passed the nsresult of the error, and not the object that caused the error, or any other "state" information.
* Due to this stateless nature, the DOM error provider will be very similar to a "generic" error service that simply maps nsresults to their text names (eg, the string "NS_ERROR_FAILURE"). The DOM has a single additional requirement of an integer "code" that makes it not quite generic.
* This stateless nature means that in some cases (possibly only in the future) the DOM will still explicitly register errors with the error service. This would be used where the nsresult is not enough to get back meaningful error information.
Does this sound reasonable?
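The three-step lookup above can be sketched in Python as below. This is an illustration under stated assumptions: the `PROVIDERS` table replaces the contract-ID lookup, `DomErrorProvider` and `get_extended_error` are hypothetical names, and the module extraction is modelled on the NS_ERROR_GET_MODULE macro with an assumed base offset of 0x45. The sample DOM code names are used purely as example data.

```python
NS_ERROR_MODULE_BASE_OFFSET = 0x45  # assumed, modelled on nsError.h
NS_ERROR_MODULE_DOM = 14


def error_module(rv):
    """Extract the module number from an nsresult-style 32-bit value."""
    return ((rv >> 16) - NS_ERROR_MODULE_BASE_OFFSET) & 0x1FFF


def error_code(rv):
    """Extract the module-specific code from the low 16 bits."""
    return rv & 0xFFFF


class DomErrorProvider:
    """Stateless provider: it sees only the nsresult, never the object."""

    NAMES = {1: "INDEX_SIZE_ERR", 3: "HIERARCHY_REQUEST_ERR"}

    def error_for(self, rv):
        code = error_code(rv)
        return {
            "result": rv,
            "code": code,  # the DOM's extra integer "code"
            "message": self.NAMES.get(code, "unknown DOM error"),
        }


# Stands in for looking up a component by a per-module contract ID.
PROVIDERS = {NS_ERROR_MODULE_DOM: DomErrorProvider()}


def get_extended_error(rv, registered=None):
    # 1. An explicitly registered error always wins.
    if registered is not None:
        return registered
    # 2. Otherwise ask the provider for the failing module, if there is one.
    provider = PROVIDERS.get(error_module(rv))
    if provider is not None:
        return provider.error_for(rv)
    # 3. No extended information is available for this call.
    return None
```

Nothing in this path runs unless a caller asks for the extended error, which is the property that keeps plain C++ callers of the DOM at near-zero overhead.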
So with this scheme I would imagine that XPConnect/PyXPCOM/random_binding would do something along these lines when calling into C++ code (or any language, for that matter):
    rv = call whatever method that's being called;
    if (NS_FAILED(rv)) {
      nsCOMPtr<nsIScriptError> err;
      // First see if the callee registered an explicit error.
      nsCOMPtr<nsIErrorService> es(do_GetService(NS_ERR_SRV_CID));
      if (es) {
        es->GetCurrentError(getter_AddRefs(err));
      }
      if (!err) {
        // Fall back to the per-module error provider.
        char buf[256];
        sprintf(buf, "%s%x", NS_SCRIPT_ERROR_CONTRACTID_PREFIX, NS_ERROR_GET_MODULE(rv));
        err = do_CreateInstance(buf);
      }
      if (err) {
        use err to throw the exception...
      } else {
        do what's done today...
      }
    }
This does allow for any component (the DOM is in no way special here) to supply its own exception objects for its own errors with no overhead whatsoever in the success case, as long as they don't need to use the error service. Does this seem reasonable?
Oh, and this also has no code size/complexity overhead in the component, as long as the error service isn't used (apart from the implementation of the exception object if a generic one can't be used).
Actually, I was picturing that nsIErrorService::GetCurrentError() could take the nsresult as a param, and put that additional logic in there. Thus, language bindings only need the simple GetCurrentError(), and need not worry about the scheme changing or more sophisticated error extraction techniques being invented in the future. This still fits logically IMO as long as we don't think of the error service as a "repository", but as a "provider". But whatever - I don't think this implementation detail changes the basic idea, which still seems reasonable to me.
Having to pass the current error code to GetCurrentError() seems a bit odd, but sure, I'd buy that too since having this logic hidden in one place would be nice.
Wouldn't GetCurrentError be brittle in multi-threaded environments? I'd prefer a GetMessageForResult, or something along those lines. (Having (a range of) dynamic nsresult mappings could even make up for errors with source info; not necessarily a nice solution, but a workaround.) Axel
As earlier comments on the bug say, the current error is thread specific. The point of this bug is not primarily to map nsresult codes to strings, but to allow arbitrary exception information to move across xpcom boundaries. It just so happens we can add an nsresult mapping feature to solve a problem for the DOM IDL effort.
I think the intent is to have the error on a per thread basis. The only real point of disagreement I have is giving the responsibility of clearing the error to the caller rather than have it cleared on access. This also reduces the need for errors to be stacked.
A few things...
I'm against encoding additional info into contractids. I'm a lot happier when they map one-to-one with some class and don't add init info. In jst's example I'd make that:
    nsCOMPtr<nsIScriptError> err = do_CreateInstance(DOM_EXCEPTION_CONTRACTID);
    if(err) err->Init(rv);
If we are going to have per-thread data then factor out the service to provide per-thread objects that manage the error data so that every call need not get TLS. What I mean is something like:
    interface nsIErrorManager {
      attribute nsIError error;
    };
    interface nsIErrorManagerService : nsIErrorManager {
      readonly attribute nsIErrorManager managerForCurrentThread;
    };
With this you can get the service and either ask for the Manager for the current thread (which you could cache and reuse on this thread - only!) or you can call its inherited interface directly and it will fetch TLS on each call. Being able to get and hold the manager (or whatever it is called) for the current thread is critical for xpconnect. I'm already going to have to get TLS once to process a call. I don't want this sub-system doing more of them behind my back on each call. The JSContextStack stuff is 'broken' in this regard and I intend to fix it.
I think you need to figure out what interfaces you want for the errors. Are these just script-interface-like errors or more general or what? No one wants to do QI calls on an exception object to figure out what to do with it. I'm still not 100% convinced that each language mapping can't do this whole thing in its own way. I understand that unifying is good in some ways, but I wonder how much it will matter.
On the question of the caller pre-clearing... This is not required. It is mostly a mechanism for the caller to be *certain* that the 'current' error is not stale. Another mechanism is to compare the current nsresult error code with the one stored in the exception object and consider a match to be 'good-enough' evidence that the exception is current. I expect that script language mappings will always clear before calling so that they can see if the callee built an exception on its own or if the language mapping code should build (and store) an exception itself.
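A small Python sketch of the two ideas in this comment: a per-thread manager that a caller can fetch once and reuse, and the "good-enough" staleness check that compares result codes. The class names mirror the IDL above but the implementation is entirely hypothetical; `tls_lookups` exists only to make the saved thread-local lookups visible and is not itself thread-safe.

```python
import threading


class ErrorManager:
    """Per-thread holder; 'error' is a (result, message) tuple or None."""

    def __init__(self):
        self.error = None


class ErrorManagerService:
    """Hypothetical service mirroring nsIErrorManagerService."""

    def __init__(self):
        self._tls = threading.local()
        self.tls_lookups = 0  # illustration only: counts TLS fetches

    @property
    def manager_for_current_thread(self):
        self.tls_lookups += 1
        manager = getattr(self._tls, "manager", None)
        if manager is None:
            manager = self._tls.manager = ErrorManager()
        return manager

    # The "inherited interface": convenient, but pays for TLS on every call.
    @property
    def error(self):
        return self.manager_for_current_thread.error

    @error.setter
    def error(self, value):
        self.manager_for_current_thread.error = value


def fresh_error(manager, rv):
    """Treat the stored error as current only if its code matches rv."""
    err = manager.error
    if err is not None and err[0] == rv:
        return err
    return None
```

A binding such as xpconnect would fetch the manager once per call (or once per thread) and then read and write `manager.error` directly, while occasional users can go through the service and accept the extra lookup.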
As I understand it the error service mechanism is simply a lightweight and optional way to get possibly complicated information across the XPCOM bridge without needing to add extra bindings between the caller and the component. It would be very messy to have to QI each errorservice to discover what kind of information was going to be sent, and I'm nervous of moving aggregated data in structures around. Instead, simply storing the nsresult and the extra information in a string (internally comma delimited etc) would suffice. This information is largely going to be used in error dialogs and similar and not further analysed. However, treating the same error codes as the same error isn't going to be good enough because the extended information could well be entirely different. For instance, a file open error in a script happens on line 20 and again with a different file on line 200. The cause might be the same but the actual error is different, and sometimes it's going to be important for that difference to be known.
A quick update on this. The recent XPCOM DOM changes introduced an nsIException and related interfaces. xpconnect's exception interface now derives from nsIException, so we are very well placed to move on this. There is still some work to better integrate DOM exceptions with nsIException, and this should get us 90% of the way there. The final thing remaining will then be to get xpconnect supporting these exception interfaces at the entry and exit points.
...for your eventual consideration, Edward :-)
Assignee: scc → kandrot
Summary: XPCOM needs rich error reporting → [RFE] XPCOM needs rich error reporting
Adding David, being the new xpconnect owner. David - I need to chat with you about this soon...
Here are some further thoughts on the exception/error stuff that David and I have been discussing in IRC. Maybe the rest of this diatribe could be massaged into a document or implementation guide for future language implementers.
XPCOM Exception Mechanisms and Implementation Guide
---------------------------------------------------
When a language such as JavaScript, Python or Ruby supports XPCOM exceptions (see nsIExceptionService and nsIException), there are 2 scenarios of interest:
1) When the language calls an external XPCOM component, this component may have set ("raised") an XPCOM exception.
2) When the language has an "unhandled" exception - ie, when the language is at its top stack frame, and has an exception (XPCOM or otherwise) that has not been handled via the language's internal exception handling constructs. Often this will be when the language has implemented an XPCOM component, and it needs to return control to the XPCOM caller. Sometimes the XPCOM caller of the language will be a whole other chain of scripting language calls (eg, Python calling JavaScript), or sometimes the XPCOM caller will itself be "top-level", and do nothing with the exception (eg, some arbitrary C++ code using an interface doing nothing useful when the interface call returns an NS_ERROR_ error code and has an exception set)
This attempts to describe how such languages are expected to interact with the XPCOM exception mechanism. It will be used as an implementation guide for at least Javascript and Python.
In Scenario (1), we expect that the language will simply use its internal exception raising mechanism to propagate the exception. The fact that an XPCOM exception occurred should not be any more significant than any other exception (ie, if a "raise/throw" statement had been executed.) The language will probably need to uniquely identify the exception as an XPCOM exception (to help implement scenario 2), but the exception mechanism should be generic for the language.
Scenario (2) is where the interesting things generally happen - as the exception is determined "unhandled" by the language implementation. Languages will often need to do 2 interesting things here.
2.a: Print an error report. As the language has no way of determining if the top-level caller of the language will print an error when an exception is set, some diagnostic (maybe only in debug builds) will generally be printed. In Javascript, this is simply the script filename and location, and error message. In Python, this will generally be a complete traceback of the stack leading to the exception.
2.b: Ensure some XPCOM exception is set for callers that can handle XPCOM exceptions (such as another scripting language). Thus, when eg, Python calls Javascript, Python itself has access to the internal Javascript exception that kicked the entire process off.
2.b simply means that an XPCOM exception must be set for the caller. However, consider the case when scenario (1) actually causes scenario (2) - ie, a Javascript component is being implemented, and this implementation calls some other XPCOM component which subsequently fails. In this case, a _new_ XPCOM exception will be created, with the original XPCOM exception set as the "inner" exception. Further, this new exception may copy fields from the original exception - such as the error message and error code.
For example: Below we have a Python component that calls a Javascript implemented component. This Javascript implementation itself calls the DOM.
:::
# Python:
class Foo:
    def DoSomething(self):
        self.js_object.UseMe()
:::
// Javascript:
UseMe: function() {
    window.SomeDOMFunction();
},
:::
Let's assume this final DOM call fails. The exception handling procedure is:
1) The DOM call fails, and returns to its caller (JS in this case) with an XPCOM Exception set. This XPCOM exception has no "location" information (ie, filename and linenumber are empty)
2) Javascript sees this exception, and turns it into an internal JS exception.
3) Javascript unwinds its stack, finding no exception handlers on the way up.
4) Javascript dumps an error report for the unhandled exception. It reports the filename and line that made the DOM call, and the error message in the DOM XPCOM exception.
5) Javascript creates a _new_ XPCOM exception. This new exception has the filename and linenumber of the JS code that failed. The error message in this exception will be a slightly modified version of the original exception message (ie, possibly with "JS component caused error:" prepended to the message). The original XPCOM exception is set as the "inner" exception of the new exception.
6) Javascript sets this new exception current, and returns to the caller.
7) Control returns to Python. It sees the XPCOM exception set - jump to Step (2), but using Python semantics. Thus, Python will correctly see the exception nominating the JS code as the "outer" failure, and will show the original XPCOM exception as the "inner" exception.
8) Steps (3) to (6) then also apply for Python.
9) Finally, control returns to the ultimate caller. This caller will generally _not_ be XPCOM Exception aware. This is fine, as all other steps in the process have printed their relevant diagnostics.
I hope this makes some vague sense :) All feedback appreciated.
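The wrapping described in steps (1) to (9) can be simulated in a few lines of Python. This is only a sketch of the chaining: `XPCOMException`, `wrap` and the three call functions are hypothetical, the filenames and line numbers are made up, and the error reports of steps (4) and (8) are omitted.

```python
class XPCOMException(Exception):
    """Minimal stand-in for an nsIException-like object."""

    def __init__(self, result, message, filename="", line=0, inner=None):
        super().__init__(message)
        self.result = result
        self.message = message
        self.filename = filename
        self.line = line
        self.inner = inner


def wrap(exc, language, filename, line):
    """Steps 5 and 6: a new exception carrying location info and the original."""
    message = "%s component caused error: %s" % (language, exc.message)
    return XPCOMException(exc.result, message, filename, line, inner=exc)


def chain(exc):
    """Walk from the outermost exception down to the original one."""
    found = []
    while exc is not None:
        found.append(exc)
        exc = exc.inner
    return found


def dom_call():
    # Step 1: the DOM fails with no location information.
    raise XPCOMException(0x80004005, "DOM call failed")


def js_use_me():
    try:
        dom_call()
    except XPCOMException as exc:
        # Steps 2 to 6 as the JS binding would perform them.
        raise wrap(exc, "JS", "component.js", 12)


def python_do_something():
    try:
        js_use_me()
    except XPCOMException as exc:
        # Steps 7 and 8: the same procedure, with Python as the wrapper.
        raise wrap(exc, "Python", "foo.py", 4)
```

Catching the exception raised by `python_do_something()` and passing it to `chain` yields the Python wrapper, the JS wrapper and the original DOM exception in that order, each keeping the original result code.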
personally i'm heavily leaning towards the JS Console over the text (printf) console. It should probably be renamed to Scripting Console or something, and i suspect that the console could be implemented in python or some other language in some other implementation...
reassign all kandrot xpcom bug.
Assignee: kandrot → dougt
One thing that I would expect is for the exception to remain consistent across the boundaries. For instance, a JS function calls an XPCOM method, which in turn calls another JS function that throws an exception. The top level JS function has a catch. I would expect the object of the catch to contain the same properties as the object thrown, or at least a superset of those properties. Essentially I would want the logic in the catch statement to work whether the code had called JS directly or indirectly. Is this a pie in the sky hope?
It seems to me that all this 'wrapping' is completely optional and not necessarily always helpful. We are trying to have a uniform exception scheme. I'd think that we'd often just want to pass through the existing exception. Otherwise, either all the exception reporters (the things that 'explain' the exceptions to the user) are going to need to be able to represent all this wrapping OR the user is going to lose the information about the original exception. The more likely this loss of *visible* useful information, the less useful the uniform exception scheme becomes.
this is all Mark's. Thanks MarkH!
Assignee: dougt → MarkH
Summary: [RFE] XPCOM needs rich error reporting → XPCOM needs rich error reporting
QA Contact: kandrot → nobody
QA Contact: nobody → xpcom
I would elect to dup this bug to bug 374852, which provides better exception handling across xpcom boundaries.
At this point I'm not convinced we want better exception codes across the boundary in most cases, and every failed call would at least have to clear a prior exception (since we don't have exceptions that automatically unwind). So for the time being, WONTFIX
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX