Closed Bug 1130844 Opened 9 years ago Closed 8 years ago

Connection errors on a specific page in mana while others work

Categories

(Infrastructure & Operations :: Infrastructure: Other, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Fallen, Assigned: jabba)

Details

I am getting strange errors on a specific mana page:

https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=43716258

While this one works:

https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=43716298

I tested with Firefox Nightly and Chrome. Firefox gives me "Secure Connection Failed" (the connection to the server was reset while the page was loading). Chrome gives me "No data".
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/547]
Moving this over to the folks who manage Mana ....
Assignee: server-ops-webops → infra
Component: WebOps: IT-Managed Tools → Infrastructure: Other
QA Contact: nmaul → jdow
I'm able to reproduce near-timeouts and extended page load times on both of the above URLs.
Jabba and I are talking about this one in person.

We've identified a configuration mismatch between Zeus and Confluence that prevents you from seeing the error Confluence is emitting for the first link.

We've also found the error (XHTML generation exceeded 120 seconds) that causes the first page, but not the second, to fail.

Zeus times out at 30 seconds, but Confluence times out (for XHTML generation) at 120 seconds. Zeus should be set ~5 seconds higher than Confluence, and I think the Confluence default of 120 seconds is nonsensical: either a page loads within 30 seconds, or we need to get an error report.
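
To illustrate the mismatch, here is a minimal Python sketch of the rule described above (proxy timeout = backend timeout + ~5 seconds). The names `TimeoutPair`, `recommended_proxy_timeout`, and `proxy_hides_backend_errors` are hypothetical and exist only for this example; they are not Zeus or Confluence settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeoutPair:
    """Hypothetical pairing of the two timeouts discussed in this bug."""
    proxy_seconds: float    # front-end load balancer (Zeus here)
    backend_seconds: float  # application behind it (Confluence XHTML generation)


def recommended_proxy_timeout(backend_seconds: float, margin_seconds: float = 5.0) -> float:
    """Proxy timeout that leaves the backend time to emit its own error first."""
    return backend_seconds + margin_seconds


def proxy_hides_backend_errors(pair: TimeoutPair) -> bool:
    """True when the proxy gives up before the backend can report a timeout."""
    return pair.proxy_seconds <= pair.backend_seconds


# Values reported in this bug: Zeus at 30 s, Confluence at 120 s.
current = TimeoutPair(proxy_seconds=30, backend_seconds=120)

# One possible fix (an assumption, not the deployed config):
# lower the backend limit to 30 s and set the proxy 5 s above it.
proposed = TimeoutPair(proxy_seconds=recommended_proxy_timeout(30), backend_seconds=30)

print(proxy_hides_backend_errors(current))   # True
print(proxy_hides_backend_errors(proposed))  # False
```

With the current values the proxy resets the connection 90 seconds before Confluence would report its own timeout, which matches the reset the reporter saw instead of a Confluence error page.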
https://mana.mozilla.org/wiki/pages/editpage.action?pageId=43716258

Viewed in the Confluence editor, the first link is pages and pages of nested tables and images, repeated for every possible mobile device we support. Whatever else we do on the server side, I *strongly* advise that this page be broken up into multiple pages, one per device, to reduce the load on the server and make it easier for everyone to use.
A new Confluence infrastructure is being stood up right now, with better timeout settings, a newer version, faster hardware, etc. Look for the announcement in the coming weeks. If this particular issue is still there after the upgrade, please re-open this bug.
Assignee: infra → jdow
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED