Open Bug 1507282 (backtrack) Opened 2 years ago Updated 7 months ago

Backtrack 1.0

Categories

(Core :: mozglue, task, P5)

task

Tracking

()

Tracking Status
firefox65 --- affected

People

(Reporter: mayhemer, Unassigned)

References

(Depends on 3 open bugs, Blocks 1 open bug, )

Details

(Whiteboard: [webcompat-sci-exclude])

Attachments

(1 file, 34 obsolete files)

47 bytes, text/x-phabricator-request
Details | Review
Attached patch wip 1 (m-c@073045259e75) (obsolete) — Splinter Review
Backtrack is a new perf tool with focus on task chaining and scheduling.

Filing in mozglue and starting over, this work was originally tracked in bug 1148204 that is overly polluted with the previous prototype now.

Things left to do, new bugs will be filed:


High priority:
- add input tracking for linux and osx
- have a runtime switch
- write two profiles compare code
- instrument rust based code: stylo, web render

Medium priority:
- optimize buffering and persisting the tracking data (all threads/all processes)
- measure and optimize the overhead
- instrument js compiler

Lower priority:
- less hacky timestamp sync
- track js promises, timers
- instrument even more code here and there


I will provide builds to play with and publish the analyzer webapp (github/instance).
Alias: backtrack
could you make the gdoc open for public viewing OR give me access for read-only?
(In reply to Mayank Bansal from comment #1)
> could you make the gdoc open for public viewing OR give me access for
> read-only?

Maybe later when it's less drafted.
Attached patch wip 2 (m-c@d99bf39f5223) (obsolete) — Splinter Review
Attachment #9025153 - Attachment is obsolete: true
Depends on: 1501438
Attached patch wip 3 (m-c@d99bf39f5223) (obsolete) — Splinter Review
Attachment #9029493 - Attachment is obsolete: true
Depends on: 1514801
Attached patch wip 4 (m-c@c38131baf660) (obsolete) — Splinter Review
Attachment #9030502 - Attachment is obsolete: true
Attached patch wip 5 (m-c@98b5a78c6cc9) (obsolete) — Splinter Review
Attachment #9032019 - Attachment is obsolete: true
Attached patch wip 6 (m-c@98b5a78c6cc9) (obsolete) — Splinter Review
Attachment #9032943 - Attachment is obsolete: true
Attached patch wip 7 (m-c@98b5a78c6cc9) (obsolete) — Splinter Review
Attachment #9032991 - Attachment is obsolete: true
Attached patch wip 8 (m-c@98b5a78c6cc9) (obsolete) — Splinter Review
Attachment #9033094 - Attachment is obsolete: true
Attached patch wip 9 (m-c@619d084f0891) (obsolete) — Splinter Review
build and runs on linux
Attachment #9033987 - Attachment is obsolete: true
Attachment #9034487 - Attachment description: wip 8 (m-c@619d084f0891) → wip 9 (m-c@619d084f0891)
Attached patch wip 10 (m-c@cf9e7037ac4c) (obsolete) — Splinter Review
Attachment #9034487 - Attachment is obsolete: true
Attached patch wip 11 (m-c@cf9e7037ac4c) (obsolete) — Splinter Review
Attachment #9035710 - Attachment is obsolete: true
Attached patch wip 12 (m-c@cf9e7037ac4c) (obsolete) — Splinter Review
Attachment #9035920 - Attachment is obsolete: true
Attached patch wip 13 (m-c@0e9d5fd6ff74) (obsolete) — Splinter Review

Looks like I'm missing tracking of stuff that gets triggered from inside PresShell::DoFlushPendingNotifications

Attachment #9036949 - Attachment is obsolete: true
Attached patch wip 14 (m-c@0e9d5fd6ff74) (obsolete) — Splinter Review
  • added DOM node repaint cause tracking (more work to get to the actual source point needed!)
  • added tracking of thread before-/after-event observers, visible as dispatch blockers now
Attachment #9045375 - Attachment is obsolete: true
Attached patch wip 15 (m-c@0e9d5fd6ff74) (obsolete) — Splinter Review
  • nits fixed, added some more runtime assertions checks, raised new ideas (written down in the work breakdown)
Attachment #9047092 - Attachment is obsolete: true
Attached patch wip 16 (m-c@1d783ed68779) (obsolete) — Splinter Review
  • XPCOM service exposing BT API (just objective for now)
  • fixed potential deadlock in timer
  • xpcshell bt startup
  • udp/server socket tracking
Attachment #9047378 - Attachment is obsolete: true
Depends on: 1538241
Attached patch wip 17 (m-c@1d783ed68779) (obsolete) — Splinter Review
  • few build issues fixed on linux
Attachment #9052259 - Attachment is obsolete: true
Attached patch wip 18 (m-c@ 20ab49faf0e1) (obsolete) — Splinter Review
  • rebased, after that only basic testing
  • builds and runs on try (not verified after rebase), modulo few details to tackle down
  • no need to disable sandboxing on win to enable bt logging from content (exceptions added)
Attachment #9054159 - Attachment is obsolete: true
Attached patch wip 19 (m-c@ 20ab49faf0e1) (obsolete) — Splinter Review
  • build again on try
  • issues left to have fully green builds:
    • use only thread_local static storage for my tls (blocked on: deque allocates dynamically and crashes in free() on a thread shutdown on win, could be std impl bug)
    • might be related to previous: fix the check_vanilla_allocations.py test
    • fix sm-pkg build, we have to include mozglue/backtrack somewhere, probably
Attachment #9056153 - Attachment is obsolete: true
Attached patch wip 20 (m-c@87829648b0e5) (obsolete) — Splinter Review
Attachment #9058296 - Attachment is obsolete: true
Depends on: 1548079
Whiteboard: [webcompat-sci-exclude]
Attached patch wip 21 (m-c@cb5734727c0a) (obsolete) — Splinter Review

only rebased

Attachment #9060940 - Attachment is obsolete: true
Type: defect → task
Priority: -- → P5
Attached patch wip 22 (m-c@d8321d0b2c5b) (obsolete) — Splinter Review

rebased, tested only locally on windows.

Attachment #9064512 - Attachment is obsolete: true
Attached patch wip 23 (m-c@d8321d0b2c5b) (obsolete) — Splinter Review

few small nits fixed

Attachment #9068500 - Attachment is obsolete: true
Attached patch wip 24 (m-c@d8321d0b2c5b) (obsolete) — Splinter Review
  • ocsp blocking tracked
Attachment #9068674 - Attachment is obsolete: true
Attached patch wip 25 (m-c@43b2d57b25cc) (obsolete) — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=feec6866213eaf143237aece33e082fc6344777a

  • added MOZ_LOG=backtrack option to write data to logs instead of to a separate file; BTa (the analyzer) doesn't understand it tho, but logan does when you checkout a special branch
  • added tracking of streams and session queues in http/2 stack
Attachment #9068756 - Attachment is obsolete: true
Comment on attachment 9071718 [details] [diff] [review]
wip 25 (m-c@43b2d57b25cc)

Nathan, can you please overlook the mozglue/backtrack/Backtrack.h file?  I wanted to write a more in-depth documentation before giving this to you, but at least there is an API doc [1], that describes what this has to do and how it's intended to be used.  The code doesn't have many comments of its own (subject to change, of course).  Please reach me when things are not clear.

Thanks!

[1] https://docs.google.com/document/d/1_kDvAeOFbRr9zzAcnJYWQIq5fvOQxyeE_bSkJZ5lBgQ/edit?usp=sharing
Attachment #9071718 - Flags: feedback?(nfroyd)
Attached patch wip 26 (m-c@ad05396bfeed) (obsolete) — Splinter Review

rebase only

Attachment #9071718 - Attachment is obsolete: true
Attachment #9071718 - Flags: feedback?(nfroyd)
Attached patch wip 27 (m-c@ad05396bfeed) (obsolete) — Splinter Review

number of build fixes and runtime assertion fixes, backup

Attachment #9076871 - Attachment is obsolete: true
Attached patch wip 28 (m-c@5805cd9ae294) (obsolete) — Splinter Review

Nathan, please see Comment 27.

Attachment #9078996 - Attachment is obsolete: true
Attachment #9080634 - Flags: feedback?(nfroyd)
Comment on attachment 9080634 [details] [diff] [review]
wip 28 (m-c@5805cd9ae294)

Review of attachment 9080634 [details] [diff] [review]:
-----------------------------------------------------------------

A lot of the documentation in the Backtrack project documentation needs to move into Backtrack.h.

It's not clear to me what the description of "dependent queues" in the API documentation is intended for.  Is the idea that you need to track timelines that may include tasks from different threads, not just tasks executing on a single thread?  Do people somehow need to be aware of dependent queues to use the API, or is the idea of dependent queues more of an implementation detail?

Is it accurate to say that Backtrack is really just extending TaskTracer to recognize there are more sources of asynchronous task execution than just the IPC message queue and thread event queues?

This patch doesn't contain any bits for the "critical path branches" (aren't those usually called "blockers"?) and the "annotate sequential tasks" bits for existing code optimization, mentioned in the project documentation, correct?

I don't understand why `mozilla::Runnable` isn't touched in any way.  (Or `nsIEventTarget` and related interfaces.)  I guess...rather than modifying all the individual tasks, we're just trying to instrument the places that actually deal with the intake and execution of tasks, with the expectation that the latter involves modifying a smaller number of places?

I am having a lot of difficulty switching back and forth between the project documentation, the API documentation, and the actual patch, trying to tie details from one to the other.  I have written some comments below, many of which are admittedly nitpicky about the details of the code itself, rather than the broader "this code looks fine as-is" vs. "this code is architecturally incorrect", which requires integrating the aforementioned three sources.  Assuming my understanding of Backtrack as a more extensive TaskTracer, above, is correct, I think the ideas are fine, but I wouldn't agree that this patch could land in its current state.

::: mozglue/backtrack/Backtrack.cpp
@@ +53,5 @@
> +namespace {  // ::{anon}
> +
> +uint32_t CurrentThreadId() {
> +  // Because of thread id recycling we must add something unique or random
> +  // This modified tid will then spread automatically

Wait, what?  Why not just use an incrementing counter in this case?  Actually, why is using the clock for the upper sixteen bits not subject to the same concerns about thread recycling?

@@ +125,5 @@
> +                            void* aClosure) {
> +  FILE* stream = (FILE*)aClosure;
> +  MozCodeAddressDetails details;
> +  static const size_t buflen = 1024;
> +  char buf[buflen + 1];  // 1 for trailing '\n'

Maybe just make `buflen` long enough and then you don't have to play the `+1` games here and below?

::: mozglue/backtrack/Backtrack.h
@@ +177,5 @@
> +MFBT_API bool Enabled(uint32_t feature = kFeatureEnabled);
> +
> +MFBT_API gid_t const RunningMarker();
> +
> +inline MOZ_MUST_USE auto RootSpan() {

We generally do not use `auto` as a return type.  I do not think it's good practice to use it as a return type without explicitly listing the returned type (e.g. `auto foo() -> ReturnType`).

I understand that writing out the return type here is painful, but that seems to be because `AutoRootMark` requires template parameters, rather than just having a two-arg constructor that would accomplish the same thing.  Can we fix that, and then the `auto` return type stuff would go away on its own?

@@ +203,5 @@
> +  detail::AutoRootMark<MarkerType::LABEL_BEGIN, MarkerType::LABEL_END> span;
> +  return span;
> +}
> +
> +inline auto Dispatch(gid_t& g) {

Same thing here for `detail::Marker`.

@@ +251,5 @@
> +
> +inline MOZ_MUST_USE auto ResponseSpan(gid_t& g) {
> +  detail::AutoSpanMark<MarkerType::RESPONSE_BEGIN, MarkerType::RESPONSE_END>
> +      resposnse(g);
> +  return resposnse;

Nit: `response`.  Also the above comments apply to `AutoSpanMark`.

@@ +258,5 @@
> +inline MOZ_MUST_USE auto ResponseSpanAndClear(gid_t& g) {
> +  detail::AutoSpanMarkAndClear<MarkerType::RESPONSE_BEGIN,
> +                               MarkerType::RESPONSE_END>
> +      resposnse(g);
> +  return resposnse;

Nit: `response`.

@@ +282,5 @@
> +  return marker;
> +}
> +
> +inline auto Milestone() {
> +  detail::Marker<gid_t> marker(detail::Mark<MarkerType::MILESTONE>());

It's not clear what a `MILESTONE` vs. an `OBJECTIVE` (above) is.

@@ +437,5 @@
> +/**
> + * Provides blocker tracking but only on a single handling thread.
> + */
> +class SingleThread : public NameQueue {
> +  gid_t mLastExecute = {0, 0};

`= gid_t()`?

@@ +473,5 @@
> + * thread (e.g. a thread pool)
> + */
> +class ThreadPool : public NameQueue {
> +  // TODO convert to Atomic<gid_t>
> +  gid_t mLastExecute = {0, 0};

`= gid_t()`?

@@ +501,5 @@
> + */
> +template <typename trails_t = SingleThread>
> +class Safe {
> +  class RefCountedTrails : public trails_t {
> +    Atomic<int32_t, Relaxed> mRefCnt;

Um.  This cannot possibly be correct.  Maybe it is correct by accident on x86-ish processors, but it's going to break virtually everywhere else.  Maybe it is correct on a single thread, but then why would this be `Atomic`?

@@ +507,5 @@
> +   public:
> +    // Has to be implemented on its own before we have
> +    // MOZ_INLINE_DECL_THREADSAFE_REFCOUNTING
> +    int32_t AddRef() { return ++mRefCnt; }
> +    int32_t Release() {

Let's not add a third implementation of `AddRef`/`Release` (one/several in XPCOM, one in MFBT).

@@ +788,5 @@
> + public:
> +  explicit TrackedTask() = default;
> +  explicit TrackedTask(char const* queueName) : Base(queueName) {}
> +  explicit TrackedTask(TaskContainer&& task) {
> +    Base::TrackDispatch();

Why is this not done in the `Base` constructor?

@@ +793,5 @@
> +    mTask.emplace(std::move(task));
> +  }
> +  explicit TrackedTask(TaskContainer&& task, Trails const& trails)
> +      : Base(trails) {
> +    Base::TrackDispatch();

Why is this not done in the `Base` constructor?

@@ +802,5 @@
> +    release();
> +
> +    mTask = std::move(other.mTask);
> +    Base::mTrails = std::move(other.mTrails);
> +    Base::mMarker = other.mMarker;

Maybe we should just call `Base::operator=` here?

@@ +812,5 @@
> +    release();
> +
> +    mTask = other.mTask;
> +    Base::mTrails = other.mTrails;
> +    Base::mMarker = other.mMarker;

Here too.

@@ +916,5 @@
> +                       TrackedTask<TaskContainer> const& container) {
> +  return &task == &container.const_task();
> +}
> +
> +template <typename TaskContainer, size_t PageSize = 512,

The default needs to be adjusted because 512 * sizeof(void*) + extra bookkeeping means that you're *just* exceeding a jemalloc size class, causing internal fragmentation.  The default needs to be adjusted downward to account for the space taken up by internal members...and accordingly, the default shouldn't really be user-visible.

@@ +1097,5 @@
> +      --detail::sPageCounter;
> +#endif
> +    }
> +
> +    void Enqueue(Task&& task, UniquePtr<Page>& head) {

This is a public API?

::: security/manager/ssl/nsNSSCallbacks.cpp
@@ +136,5 @@
>  }
>  
>  nsresult OCSPRequest::DispatchToMainThreadAndWait() {
> +  backtrack::Milestone().DynamicName(
> +      "OCSPRequest::DispatchToMainThreadAndWait [%s]", mAIALocation.get());

Maybe we could just make `Marker` overloads, one of which takes a constant character array (string constants), and one that takes `printf`-style args, rather than constructing something and calling `DynamicName`?  I guess we'd need to add overloads for all the sub-type constructors, which would be tedious, but at least we'd statically ensure nobody forgets a name.
Attachment #9080634 - Flags: feedback?(nfroyd) → feedback+

(In reply to Nathan Froyd [:froydnj] from comment #31)

Comment on attachment 9080634 [details] [diff] [review]
wip 28 (m-c@5805cd9ae294)

Review of attachment 9080634 [details] [diff] [review]:

A lot of the documentation in the Backtrack project documentation needs to
move into Backtrack.h.

I wanted to have decent technical doc before asking you for a feedback the first time, but time pressure did not allow me to have one.

It's not clear to me what the description of "dependent queues" in the API
documentation is intended for. Is the idea that you need to track timelines
that may include tasks from different threads, not just tasks executing on a
single thread?

Dependent queues have nothing to do with multi threading. Dependent queues (not sure the name was chosen well) is a pattern you will find in nearly any at least a bit asynchronous software code - whenever you put something aside for later. To explain - an INdependent queue is a queue that when you dispatch to it, will do its best to run the task ASAP in perhaps a FIFO manner, no conditions. Tasks put in a dependent queue though must usually wait for some other (async) code or task chain to finish first and then 'manually' trigger the loop over this dependent queue and run the queued tasks in it. Hence "dependent". Technically, when a task is run from inside another tasks (nesting) it's considered dependent. Note that thread looping cores are surrounded by LOOP_BEGIN and LOOP_END markers that remove (hides) any nesting. Hence, tasks running from inside 'manual' call to NS_ProcessNextEvent() are not considered dependent.

If you ignore dependency triggers (you don't go deep in the nesting) you will get a single path of nodes, no branching (I call it the shallow critical path). This single path may (will) skip among threads/processes. If you start looking at dependency triggers as well, you will be getting a tree. Again, the added branches will skip from thread to thread. To visualize this, you can either draw some kind of connection arrows between threads or show the main (shallow) critical path as one timeline and each dependency triggering path (in the example from the doc it would go back to vsync tick OS notification) as a separate timeline.

Do people somehow need to be aware of dependent queues to
use the API, or is the idea of dependent queues more of an implementation
detail?

It's a detail, really. Maybe only concern may be to add proper 'loop' markers to root thread looping code (like in ImageDecoder). W/o it all the decoding tasks are seen as dependent and when you look at its examination, you just see what the loop was doing (immediately processing tasks) which is not a valuable information at all, and it's rather confusing.

Is it accurate to say that Backtrack is really just extending TaskTracer to
recognize there are more sources of asynchronous task execution than just
the IPC message queue and thread event queues?

Yes, TT was a big inspiration for me and the main idea remains the same.

This patch doesn't contain any bits for the "critical path branches" (aren't
those usually called "blockers"?)

No, no code for branches (or however we may call them eventually) in this patch yet.

"critical path branches" are multiple chains of async tasks that all have to complete to unblock some state evolution (like load all css files before you can paint)
"Blockers" are events/tasks that simply make your task wait in a FIFO queue

and the "annotate sequential tasks" bits
for existing code optimization, mentioned in the project documentation,
correct?

As well, this is another next step thing to do.

I don't understand why mozilla::Runnable isn't touched in any way. (Or
nsIEventTarget and related interfaces.) I guess...rather than modifying
all the individual tasks, we're just trying to instrument the places that
actually deal with the intake and execution of tasks, with the expectation
that the latter involves modifying a smaller number of places?

Exactly. If you wonder why I do wrapping of tasks (generally, not just nsIRunnables) is that I also need to track destructor code same way as the main Run() method. Destructors may do stuff (block CPU) and also may dispatch other tasks and we can't loose the linking.

Updating every runnable everywhere is overkill and puts unnecessary work on implementers and is not errorproof.

I am having a lot of difficulty switching back and forth between the project
documentation, the API documentation, and the actual patch, trying to tie
details from one to the other. I have written some comments below, many of
which are admittedly nitpicky about the details of the code itself, rather
than the broader "this code looks fine as-is" vs. "this code is
architecturally incorrect", which requires integrating the aforementioned
three sources. Assuming my understanding of Backtrack as a more extensive
TaskTracer, above, is correct, I think the ideas are fine, but I wouldn't
agree that this patch could land in its current state.

Sure, I believe there is much to update.

::: mozglue/backtrack/Backtrack.cpp
@@ +53,5 @@

+namespace { // ::{anon}
+
+uint32_t CurrentThreadId() {

  • // Because of thread id recycling we must add something unique or random
  • // This modified tid will then spread automatically

Wait, what? Why not just use an incrementing counter in this case?

No. There would still be a possibility for two threads on different process to share TID and also hit the same (per process) counter.

TIDs are quite open, though. I need to sync with the Gecko profiler on this. For the code as is now (which is not something to land) I want to have something per-session unique, that's mostly the only requirement here that makes things very simple then.

Anyway, this will change before landing.

Actually, why is using the clock for the upper sixteen bits not subject to
the same concerns about thread recycling?

@@ +125,5 @@

  •                        void* aClosure) {
    
  • FILE* stream = (FILE*)aClosure;
  • MozCodeAddressDetails details;
  • static const size_t buflen = 1024;
  • char buf[buflen + 1]; // 1 for trailing '\n'

Maybe just make buflen long enough and then you don't have to play the
+1 games here and below?

::: mozglue/backtrack/Backtrack.h
@@ +177,5 @@

+MFBT_API bool Enabled(uint32_t feature = kFeatureEnabled);
+
+MFBT_API gid_t const RunningMarker();
+
+inline MOZ_MUST_USE auto RootSpan() {

We generally do not use auto as a return type. I do not think it's good
practice to use it as a return type without explicitly listing the returned
type (e.g. auto foo() -> ReturnType).

I understand that writing out the return type here is painful,

yep, auto is here as a shortcut.

but that
seems to be because AutoRootMark requires template parameters, rather than
just having a two-arg constructor that would accomplish the same thing. Can
we fix that, and then the auto return type stuff would go away on its own?

I'm not sure I follow the idea for a two-arg ctor here. But I can definitely change to comply more with our style.

OTOH, I was more expecting a different concern here. The API is not 100% safe to use as I wrote it (main design driver was to save typing when the API is used around the code base).

you can easily write:

auto root(backtrack::RootSpan().StaticName("some label"));

the code will compile, because RootSpan() returns a Marker (via copy elision) and StaticName() returns Marker& (a ref). But the object returned by RootSpan() will only be temporary and destroyed, not mentioning that access to root will crash. I think one solution would be to make StaticName() and other Marker methods simply be void and thus disallow chaining. But the code may still compile as void still is a type?

@@ +203,5 @@

  • detail::AutoRootMark<MarkerType::LABEL_BEGIN, MarkerType::LABEL_END> span;
  • return span;
    +}

+inline auto Dispatch(gid_t& g) {

Same thing here for detail::Marker.

@@ +251,5 @@

+inline MOZ_MUST_USE auto ResponseSpan(gid_t& g) {

  • detail::AutoSpanMark<MarkerType::RESPONSE_BEGIN, MarkerType::RESPONSE_END>
  •  resposnse(g);
    
  • return resposnse;

Nit: response. Also the above comments apply to AutoSpanMark.

@@ +258,5 @@

+inline MOZ_MUST_USE auto ResponseSpanAndClear(gid_t& g) {

  • detail::AutoSpanMarkAndClear<MarkerType::RESPONSE_BEGIN,
  •                           MarkerType::RESPONSE_END>
    
  •  resposnse(g);
    
  • return resposnse;

Nit: response.

@@ +282,5 @@

  • return marker;
    +}

+inline auto Milestone() {

  • detail::Marker<gid_t> marker(detail::Mark<MarkerType::MILESTONE>());

It's not clear what a MILESTONE vs. an OBJECTIVE (above) is.

Sure. OBJECTIVE is intended to be picked in the (current) offline analyzer as a starting point (or ending, actually) to go back from. MILESTONE is only an informative thing and it was introduced to filter out all other overwhelming data from paths. It was an attempt to make comparing of two paths simpler. I think we still may find MILESTONES useful for data analyzes.

@@ +437,5 @@

+/**

    • Provides blocker tracking but only on a single handling thread.
  • */
    +class SingleThread : public NameQueue {
  • gid_t mLastExecute = {0, 0};

= gid_t()?

@@ +473,5 @@

    • thread (e.g. a thread pool)
  • */
    +class ThreadPool : public NameQueue {
  • // TODO convert to Atomic<gid_t>
  • gid_t mLastExecute = {0, 0};

= gid_t()?

@@ +501,5 @@

  • */
    +template <typename trails_t = SingleThread>
    +class Safe {
  • class RefCountedTrails : public trails_t {
  • Atomic<int32_t, Relaxed> mRefCnt;

Um. This cannot possibly be correct.

You are referring to the Relaxed ordering? AFAIK, for ref counters, this is quite sufficient. The arithmetic is atomic. I'm not sure about the visibility, but I believe only one thread will see the final 0. And because of Illusion of atomic reference counting access has to be synchronized when manipulating an object at the same referring smart pointer. As this is tight close to the queue manipulation (which is expected to be used on multiple threads), there is an expectation such synchronization will be present.

But please fix me if I'm wrong :)

Maybe it is correct by accident on
x86-ish processors, but it's going to break virtually everywhere else.
Maybe it is correct on a single thread, but then why would this be Atomic?

@@ +507,5 @@

  • public:
  • // Has to be implemented on its own before we have
  • // MOZ_INLINE_DECL_THREADSAFE_REFCOUNTING
  • int32_t AddRef() { return ++mRefCnt; }
  • int32_t Release() {

Let's not add a third implementation of AddRef/Release (one/several in
XPCOM, one in MFBT).

Don't like it too, but I had to. There is nothing in MFBT last time I have checked that could be used easily on templated classes. I was using external::AtomicRefCounted before but class size checking was failing - because of the template.

@@ +788,5 @@

  • public:
  • explicit TrackedTask() = default;
  • explicit TrackedTask(char const* queueName) : Base(queueName) {}
  • explicit TrackedTask(TaskContainer&& task) {
  • Base::TrackDispatch();

Why is this not done in the Base constructor?

If you have to ask, then the documentation was not clear enough about how Tracked<> and TrackedTask<> work :(

The Tracked<> class expects TrackDispatch() (and TrackExecute()) be always called fully manually. The TrackedTask<> is intended to hide calls to Track*() method completely (I could even try to derive from Tracked via protected, perhaps) and the task being wrapped is treated as 'dispatched' the moment we create this wrapper around it. As in the doc, the intended use is to, instead of mQueue.Append(move(task)), insert the task simply as mQueue.Append(TrackedTask(move(task))).

@@ +793,5 @@

  • mTask.emplace(std::move(task));
  • }
  • explicit TrackedTask(TaskContainer&& task, Trails const& trails)
  •  : Base(trails) {
    
  • Base::TrackDispatch();

Why is this not done in the Base constructor?

@@ +802,5 @@

  • release();
  • mTask = std::move(other.mTask);
  • Base::mTrails = std::move(other.mTrails);
  • Base::mMarker = other.mMarker;

Maybe we should just call Base::operator= here?

@@ +812,5 @@

  • release();
  • mTask = other.mTask;
  • Base::mTrails = other.mTrails;
  • Base::mMarker = other.mMarker;

Here too.

@@ +916,5 @@

  •                   TrackedTask<TaskContainer> const& container) {
    
  • return &task == &container.const_task();
    +}

+template <typename TaskContainer, size_t PageSize = 512,

The default needs to be adjusted because 512 * sizeof(void*) + extra
bookkeeping means that you're just exceeding a jemalloc size class,
causing internal fragmentation. The default needs to be adjusted downward
to account for the space taken up by internal members...and accordingly, the
default shouldn't really be user-visible.

Very good point! I found out (after I've already wrote this class :/) that the xpcom internal thread queue is using 256 or even less page size. I definitely wanted to make this smaller here too, yes. I made this a template arg because on number of places I replace nsTArrays with bt::Queue there is an optimization for the array allocation size, sometime to even as low as 16 elements. So I think it may be a good optimization?

@@ +1097,5 @@

  •  --detail::sPageCounter;
    

+#endif

  • }
  • void Enqueue(Task&& task, UniquePtr<Page>& head) {

This is a public API?

No - isn't the Page sub-class private?

::: security/manager/ssl/nsNSSCallbacks.cpp
@@ +136,5 @@

}

nsresult OCSPRequest::DispatchToMainThreadAndWait() {

  • backtrack::Milestone().DynamicName(
  •  "OCSPRequest::DispatchToMainThreadAndWait [%s]", mAIALocation.get());
    

Maybe we could just make Marker overloads, one of which takes a constant
character array (string constants), and one that takes printf-style args,
rather than constructing something and calling DynamicName?

Originally I wanted to have only Name, overloaded for Name(char const* static_name) and Name(char const* fmt, ...); But the compiler sees it as ambiguous.

I guess we'd
need to add overloads for all the sub-type constructors, which would be
tedious, but at least we'd statically ensure nobody forgets a name.

I invented this 'amend' approach to have a variable length markers. Not all marker types need names, some will have more than one. Some also have special data appended. But I think for e.g. Milestone and Objective (and some others) would be better to pass the args to the Milestone function directly, yes. Note that most naming will be combination of static and dynamic. There will always be a method name (static) and a resource (URL, ...) in pair. So in the end, I'd see (schematically) something like: Milestone().StaticName(__func__).DynamicName("%s", url.spec()); compressed to Milestone(__func__, "%s", url.spec()); and similarly for other marker types, perhaps. This will save memory a lot when the static part (func names) will not be in the formating string.

Thanks for the feedback. Before I ask again, I'll try to fix as much as possible according your comment and also build a better technical doc on how the API works.

Attachment #9080634 - Attachment is obsolete: true
Attached file Bug 1507282 - Backtrack baseline code (obsolete) —

Depends on D40467

Depends on D40468

Depends on D42290

Depends on D42291

Depends on D42292

Attachment #9082700 - Attachment is obsolete: true
Attachment #9082701 - Attachment is obsolete: true
Attachment #9086010 - Attachment is obsolete: true
Attachment #9086011 - Attachment is obsolete: true
Attachment #9086012 - Attachment is obsolete: true
Attachment #9086013 - Attachment is obsolete: true
Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED

No, sorry bot, I'm not gonna work on this.

Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
Whiteboard: [webcompat-sci-exclude] → [webcompat-sci-exclude][no-nag]
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
Whiteboard: [webcompat-sci-exclude][no-nag] → [webcompat-sci-exclude]
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Assignee: honzab.moz → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.