What causes processing time differences running REST endpoint from different sources?

Background:

We are creating a SAAS app using Vue front-end, Django/DRF backend, Postgresgl, all running in a Docker environment. The benchmarks below were run on our local dev machines.

The process to register a new "owner" is rather complex. It does the following:

  • Create tenant and schema
  • Run migrations (done in the create schema process)
  • Create MinIO bucket
  • Load "production" fixtures
  • Run sync_permissions
  • Create an owner instance in the newly created schema

We are seeing some significant differences in processing times for some of the above steps running the registration process in different ways. In trying to figure out our issue, we have tried the following four methods to invoke the registration process:

  • from the Vue front-end hitting the API endpoint
  • from a REST client (Talend)
  • from the APIBrowser (provided by DRF)
  • (in some cases) via manage.py

We tried it from the REST client to try to eliminate Vue as the culprit, but we got similar times between Vue and the REST client.

We also saw similar times between the APIBrowser and the manage.py method, so in the tables below, we are comparing Talend to APIBrowser (or manage.py).

The issue:

Here are the processing times for several of the steps listed above:

|---------------------|--------|------------|--------|
| Process             | Talend | APIBrowser | Factor |
|---------------------|--------|------------|--------|
| Create Tenant       | 11.853 |    1.185   |  10.0  |
|---------------------|--------|------------|--------|
| Create MinIO Bucket |  0.386 |    0.273   |   1.4  |
|---------------------|--------|------------|--------|
| Load Fixtures       |  0.926 |    0.215   |   4.3  |
|---------------------|--------|------------|--------|
| Sync Permissions    | 61.115 |    5.390   |  11.3  |
|---------------------|--------|------------|--------|
|             Overall | 74.280 |    7.053   |  10.5  |
|---------------------|--------|------------|--------|

In both cases (Talend and APIBrowser), it is running the exact same code. We don’t understand why the REST client method takes more than 10 times as long as running from APIBrowser.

We then tried to get down to finer detail in our benchmark timing. We focused on the first step and quickly noticed that it was the process of running migrate_schemas that was the issue. Here’s a list of processing times for each migration file it processed. This time, we ran the second pass via manage.py instead of APIBrowser, but as mentioned previously, those times were comparable.

|---------------------|--------|-----------|--------|
| Migration file      | Talend | manage.py | Factor |
|---------------------|--------|-----------|--------|
| activity_log.0001   |  0.133 |   0.013   |  10.2  |
| countries.0001      |  0.086 |   0.013   |   6.6  |
| contenttypes.0001   |  0.178 |   0.022   |   8.1  |
| contenttypes.0002   |  0.159 |   0.033   |   4.8  |
| auth.0001           |  0.530 |   0.092   |   5.8  |
| auth.0002           |  0.124 |   0.022   |   5.6  |
| auth.0003           |  0.090 |   0.023   |   3.9  |
| auth.0004           |  0.097 |   0.027   |   3.6  |
| auth.0005           |  0.126 |   0.016   |   7.9  |
| auth.0006           |  0.079 |   0.006   |  13.2  |
| auth.0007           |  0.079 |   0.011   |   7.2  |
| auth.0008           |  0.100 |   0.011   |   9.1  |
| auth.0009           |  0.085 |   0.014   |   6.1  |
| auth.0010           |  0.121 |   0.015   |   8.1  |
| auth.0011           |  0.087 |   0.018   |   4.8  |
| users.0001          |  0.871 |   0.115   |   7.6  |
| admin.0001          |  0.270 |   0.035   |   7.7  |
| admin.0002          |  0.093 |   0.022   |   4.2  |
| admin.0003          |  0.091 |   0.024   |   3.8  |
| authtoken.0001      |  0.193 |   0.036   |   5.4  |
| authtoken.0002      |  0.395 |   0.090   |   4.4  |
| clients.0001        |  0.537 |   0.082   |   6.5  |
| clients.0002        |  0.519 |   0.145   |   3.6  |
| projects.0001       |  0.475 |   0.062   |   7.7  |
| projects.0002       |  0.293 |   0.062   |   4.7  |
| sessions.0001       |  0.191 |   0.023   |   8.3  |
| tasks.0001          |  0.241 |   0.122   |   2.0  |
| tenants.0001        |  0.086 |   0.017   |   5.1  |
|---------------------|--------|-----------|--------|
|         Total time: | 10.404 |   1.618   |   6.4  |
|---------------------|--------|-----------|--------|

Our Theory:

We think it must have something to do with Talend (and Vue) initiating the process from a different domain (as it will be when the site is live), but in the case of APIBrowser, it starts from the actual endpoint (i.e. the same domain) that the endpoint is defined for.

That means, in our local environment, running from Vue, we are on local.dev and it hits the local.api endpoint. But running from APIBrowser, we go directly to local.api, then fill in the data on the form and POST it.

Our theory is that it must be affecting how files are accessed. The migrate_schemas process has to open many .py files. And the worst culprit, SyncPermissions, is processing many .yaml files where we have defined our default permission structure utilized by each tenant. I should point out that the LoadFixtures process also opens external .yaml files, but in this case, it only has one file to process, so the difference is minimized.

It may be like the difference between opening an image file in code vs. a template showing an image via HTML. In the HTML version, it’s essentially another request on the server – which surely takes longer than programmatically opening an image on disk.

What we don’t understand is why opening files in these processes would be affected by the two methods of initiating the process. Obviously, since the site will have to run in Vue, having the registration process take 70 seconds when we know it could be done in only 7 seconds is unacceptable. Any help solving this issue would be greatly appreciated.

Note:

I realize it is the norm here in SO to include code for the process in question, but in this case, both processes are running the exact same code – which is why I decided not to post several hundred lines of code here.

Source: Docker Questions