Closed Bug 1851909 Opened 2 years ago Closed 2 years ago

speed up upload and fileupload admin pages

Categories: Tecken :: General, defect, P2

Tracking: Not tracked

Status: RESOLVED FIXED

People: Reporter: willkg, Assigned: willkg

Attachments: 2 files

As part of bug #1746940, I added UploadAdmin and FileUploadAdmin and they are hilariously slow in production because those tables are enormous.

We should change how the admin paginates so that it works better with large tables.

Consider looking at django-admin-cursor-paginator.

https://pypi.org/project/django-admin-cursor-paginator/
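
For context, here is a minimal sketch of the general cursor (keyset) pagination idea that this kind of paginator is built around. It is not the django-admin-cursor-paginator implementation. The table name, columns, and the `make_db` and `fetch_page` helpers are hypothetical, and it uses an in-memory SQLite database rather than Postgres so that it runs on its own.

```python
import sqlite3


def make_db(n_rows):
    """Build a small in-memory table standing in for a large upload table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fileupload (id INTEGER PRIMARY KEY, key TEXT)")
    conn.executemany(
        "INSERT INTO fileupload (id, key) VALUES (?, ?)",
        [(i, f"v1/file-{i}.sym") for i in range(1, n_rows + 1)],
    )
    return conn


def fetch_page(conn, page_size, before_id=None):
    """Return (rows, next_cursor) for one newest-first page.

    Instead of OFFSET, the cursor is the id of the last row on the
    previous page, so each page is a bounded range on the primary key
    and no total row count is needed.
    """
    if before_id is None:
        rows = conn.execute(
            "SELECT id, key FROM fileupload ORDER BY id DESC LIMIT ?",
            (page_size,),
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id, key FROM fileupload WHERE id < ? "
            "ORDER BY id DESC LIMIT ?",
            (before_id, page_size),
        ).fetchall()
    # A short page means there is nothing left to fetch.
    next_cursor = rows[-1][0] if len(rows) == page_size else None
    return rows, next_cursor


conn = make_db(250)
page1, cursor = fetch_page(conn, 100)
page2, cursor2 = fetch_page(conn, 100, before_id=cursor)
page3, cursor3 = fetch_page(conn, 100, before_id=cursor2)
```

The trade-off is that a cursor only supports "next" and "previous" navigation; jumping to an arbitrary page number is not possible without counting rows.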

Grabbing this now because it's painful.

Assignee: nobody → willkg
Status: NEW → ASSIGNED

This got deployed to prod in bug #1854172. The fileupload admin page is still very slow. It takes 18s to load.

I'm adjusting the scope of this bug to cover improving those admin pages. I'll look at what we can do next.

Summary: switch to django-admin-cursor-paginator → speed up upload and fileupload admin pages

In bug #1853981, we copied the prod db to stage, so stage now has large upload and fileupload tables. That makes it possible to test changes in stage and get a rough idea of what a change might do in prod.

Unscientific timings on stage with virtually no load before landing PR 2805:

what                                    t1     t2     t3     t4     t5
fileupload page                         17.0s  6.4s   7.0s   6.5s   6.1s
fileupload page with query (libxul.so)  19.5s  19.9s  19.8s  19.6s  19.6s

Unscientific timings on stage with virtually no load after landing PR 2805 and a stage deploy:

what                                     t1     t2     t3     t4     t5
file upload page                         5.9s   6.0s   6.1s   6.3s   6.1s
file upload page 2                       5.6s   6.0s   5.8s   6.2s   6.8s
file upload page with query (libxul.so)  19.2s  19.7s  19.6s  20.0s  19.5s

Seems like there's a slight <1s improvement. It's not worse. We'll see how it is in prod.
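
One rough way to check that reading is to compare medians, which are less sensitive to the 17.0s first run (possibly a cold cache, though that is an assumption). The sketch below uses the timings from the two tables above; it assumes "fileupload page" and "file upload page" are the same page, and the `median_change` helper is hypothetical.

```python
from statistics import median

# Timings in seconds, copied from the before and after tables.
before = {
    "fileupload page": [17.0, 6.4, 7.0, 6.5, 6.1],
    "fileupload page with query (libxul.so)": [19.5, 19.9, 19.8, 19.6, 19.6],
}
after = {
    "fileupload page": [5.9, 6.0, 6.1, 6.3, 6.1],
    "fileupload page with query (libxul.so)": [19.2, 19.7, 19.6, 20.0, 19.5],
}


def median_change(before_runs, after_runs):
    """Median of the after runs minus median of the before runs, in seconds."""
    return round(median(after_runs) - median(before_runs), 1)


changes = {name: median_change(before[name], after[name]) for name in before}
for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}s")
```

With five runs each this is still unscientific, but it lines up with the comment above: the plain listing median drops from 6.5s to 6.1s, and the query case stays at 19.6s.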

Going back a bit, the fileupload page is doing this:

SELECT "upload_fileupload"."id", "upload_fileupload"."upload_id",  
"upload_fileupload"."bucket_name", "upload_fileupload"."key",  
"upload_fileupload"."update", "upload_fileupload"."compressed",  
"upload_fileupload"."size", "upload_fileupload"."completed_at",  
"upload_fileupload"."created_at", "upload_fileupload"."microsoft_download",  
"upload_fileupload"."debug_filename", "upload_fileupload"."debug_id",  
"upload_fileupload"."code_file", "upload_fileupload"."code_id",  
"upload_fileupload"."generator"  
FROM "upload_fileupload"  
ORDER BY "upload_fileupload"."id" DESC
LIMIT 101; args=()

Explain for that is:

Limit  (cost=12.37..12.52 rows=60 width=1243)  
   ->  Sort  (cost=12.37..12.52 rows=60 width=1243)  
         Sort Key: id DESC  
         ->  Seq Scan on upload_fileupload  (cost=0.00..10.60 rows=60 width=1243)  

That suggests this is doing a table scan. There is an index on the primary key:

public | upload_fileupload_created_at_bd418669                          | index | postgres | upload_fileupload
public | upload_fileupload_pkey                                         | index | postgres | upload_fileupload
public | upload_fileupload_upload_id_aa5b1856                           | index | postgres | upload_fileupload

Am I misunderstanding that? Shouldn't it be using the upload_fileupload_pkey index and not doing a table scan?

It can't use the index because the index only contains the id and the query is pulling back all the data to create FileUpload instances to use for the listing page.

The next step is probably to remove the admin view and add update functionality to the UI.

This was deployed to prod in bug #1856685.

what                                     t1   t2   t3
file upload page                         14s  13s  14.5s
file upload page 2                       13s  13s  14s
file upload page with query (libxul.so)  35s  35s  35s

That's not great, but it's an admin page that only I'm touching, so it's good enough for now. Marking as FIXED.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED