Glitch in database detail grid with the powa-web-git image #14

rjuju · 2024-07-27T11:48:36Z

@pgiraud I don't know if you can reproduce this problem.

On my side, trying the powa_remote_mode.yml compose file (I just pushed the missing POWA_REMOTE_PORT on this one and another compose file + the use of v3 format) I have a normal looking instance page, e.g.:

But if I shut down the compose file and restart it, for some reason the db detail top row gets messed up and the last few columns are pushed out as if there was a wrong colspan:

Just restarting the powa-git-web container is not enough; I need a full "podman-compose down && podman-compose up" for that. Once done, it stays broken no matter what I do; the only way to fix it is to remove the image and fetch it again.

Can you reproduce the problem? If yes, can you fix it?

pgiraud · 2024-07-27T15:02:34Z

Unfortunately, I'm not able to reproduce the problem.

What I did:

pruned everything I could (volumes, images, containers, system),
ran podman-compose -f compose/powa_remote_mode.yml up (not detached),
opened the browser for the "primary" server (I had to go to Configuration to reload the collector because it was marked as stopped),
waited for a few seconds,
pressed Ctrl-C to stop the compose,
ran the same command again.

Everything displays as expected.

How did you get the "tpc" and "obvious" databases? I don't have them when using the powa_remote_mode.yml compose file. I thought that they were only available with the demo workload image.

rjuju · 2024-07-27T15:18:07Z

I'm not sure what exactly Ctrl-C does. Can you try:

podman-compose -f compose/powa_remote_mode.yml up -d
# wait until it's up and has taken a couple of snapshots
podman-compose -f compose/powa_remote_mode.yml down -t0

podman-compose -f compose/powa_remote_mode.yml up -d

which is what I'm usually using, if that matters.
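For repeated attempts, the same sequence can be scripted. This is only a sketch: the commands are the ones listed above, while the helper names (`repro_commands`, `run_repro`) and the 120-second wait are assumptions chosen for illustration, not something the compose setup provides.

```python
import subprocess
import time

def repro_commands(compose_file):
    """Return the up / down / up sequence as argv lists."""
    base = ["podman-compose", "-f", compose_file]
    return [
        base + ["up", "-d"],      # first start, detached
        base + ["down", "-t0"],   # stop everything without a grace period
        base + ["up", "-d"],      # second start, where the glitch shows up
    ]

def run_repro(compose_file, wait_seconds=120):
    """Run the sequence; wait_seconds is an arbitrary guess at how long
    a couple of snapshots take, adjust it to the snapshot interval."""
    first_up, down, second_up = repro_commands(compose_file)
    subprocess.run(first_up, check=True)
    time.sleep(wait_seconds)
    subprocess.run(down, check=True)
    subprocess.run(second_up, check=True)

# Usage (not executed here):
# run_repro("compose/powa_remote_mode.yml")
```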

How did you get the "tpc" and "obvious" databases? I don't have them when using the powa_remote_mode.yml compose file. I thought that they were only available with the demo workload image.

I'm just using some modified compose file. It's simply the powa_remote_mode.yml on which I added the 3 containers used for the dev demo compose file. Apart from the extra containers there are no modifications

pgiraud · 2024-07-27T15:20:32Z

Could you share the modified compose file just in case it helps reproducing?

rjuju · 2024-07-27T15:23:13Z

$ diff ../powa_demo.yml compose/powa_remote_mode.yml
70,112d69
< 
<   pgbench-std-primary:
<     image: powateam/powa-pgbench
<     container_name: powa-dev-pgbench-std-primary
<     restart: on-failure
<     environment:
<       PGHOST: 'remote-primary'
<       PGUSER: 'postgres'
<       PGPORT: 5433
<       BENCH_SCALE_FACTOR: 10
<       BENCH_TIME: 60
<       BENCH_FLAG: '-c1 -j1 -n -R 10'
<     depends_on:
<       remote-primary:
<         condition: service_healthy
< 
<   pgbench-std-standby:
<     image: powateam/powa-pgbench
<     container_name: powa-dev-pgbench-std-standby
<     restart: on-failure
<     environment:
<       PGHOST: 'remote-standby'
<       PGUSER: 'postgres'
<       PGPORT: 5434
<       BENCH_SKIP_INIT: 'true'
<       BENCH_SCALE_FACTOR: 10
<       BENCH_TIME: 120
<       BENCH_FLAG: '-c2 -j2 -S -n -R 10'
<     depends_on:
<       remote-standby:
<         condition: service_healthy
< 
<   pgdemoworload-std-primary:
<     image: powateam/powa-demoworkload
<     container_name: powa-dev-demoworkload-std-primary
<     restart: on-failure
<     environment:
<       PGHOST: 'remote-primary'
<       PGUSER: 'postgres'
<       PGPORT: 5433
<     depends_on:
<       remote-primary:
<         condition: service_healthy

pgiraud · 2024-07-27T15:30:04Z

OK, I could reproduce without the need of the extra containers. I have an idea (possibly wrong) about what's happening.

pgiraud · 2024-07-27T16:57:44Z

In my case, with the default compose file, the collector for the primary server is stopped after the first compose up command (starting from a clean podman environment).

At this point, the overview page for the primary server (http://localhost:8888/server/1/overview/) doesn't show anything (empty components) and the grid for the databases is kind of broken: there are more column groups than columns.

In the web console, the JSON response for this page looks like:

                {
                    "server": "1",
                    "title": "Details for all databases",
                    "metrics": [
                        "by_database.calls",
                        "by_database.runtime",
                        "by_database.avg_runtime",
                        "by_database.shared_blks_read",
                        "by_database.shared_blks_hit",
                        "by_database.shared_blks_dirtied",
                        "by_database.shared_blks_written",
                        "by_database.temp_blks_read",
                        "by_database.temp_blks_written",
                        "by_database.io_time"
                    ],
                    "columns": [
                        {
                            "name": "datname",
                            "label": "Database",
                            "url_attr": "url"
                        }
                    ],
                    "toprow": [
                        {},
                        {},
                        {
                            "name": "Execution",
                            "colspan": 3
                        },
                        {
                            "name": "Blocks",
                            "colspan": 4
                        },
                        {
                            "name": "Temp blocks",
                            "colspan": 2
                        },
                        {
                            "name": "I/O"
                        },
                        {
                            "name": "WAL",
                            "colspan": 3
                        },
                        {
                            "name": "JIT",
                            "colspan": 2
                        }
                    ],
                    "type": "grid"
                }

The problem seems to be that the toprow column groups don't match the columns (metrics). We actually don't take the metrics removal into account when we compute the toprow.
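The mismatch can be checked directly from the response. A minimal sketch, assuming that a toprow entry without "colspan" spans a single column and that only the named groups are meant to cover the metrics; `named_group_span` and `surplus_columns` are hypothetical helpers, not part of powa-web:

```python
def named_group_span(toprow):
    """Number of columns claimed by the named toprow groups.

    Entries without a "name" (the empty {} placeholders) are ignored;
    a missing "colspan" is assumed to mean 1.
    """
    return sum(g.get("colspan", 1) for g in toprow if g.get("name"))

def surplus_columns(widget):
    """Columns claimed by the toprow groups minus the metrics actually sent."""
    return named_group_span(widget["toprow"]) - len(widget["metrics"])

# Trimmed copy of the response shown above.
widget = {
    "metrics": [
        "by_database.calls", "by_database.runtime", "by_database.avg_runtime",
        "by_database.shared_blks_read", "by_database.shared_blks_hit",
        "by_database.shared_blks_dirtied", "by_database.shared_blks_written",
        "by_database.temp_blks_read", "by_database.temp_blks_written",
        "by_database.io_time",
    ],
    "toprow": [
        {}, {},
        {"name": "Execution", "colspan": 3},
        {"name": "Blocks", "colspan": 4},
        {"name": "Temp blocks", "colspan": 2},
        {"name": "I/O"},
        {"name": "WAL", "colspan": 3},
        {"name": "JIT", "colspan": 2},
    ],
}

print(named_group_span(widget["toprow"]))  # 15
print(surplus_columns(widget))             # 5
```

The named groups claim 15 columns while only 10 metrics are sent; the surplus of 5 is exactly the "WAL" (3) and "JIT" (2) groups, whose metrics are absent from the response.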

On the UI part, there might be a problem too. If the collector is manually reloaded (using the Actions menu on the top bar), the grid layout stays broken until the user reloads the whole page (F5 or reload button in the browser) or navigates in the pages (for example go back to list of servers and then choose primary server again).

Can you confirm that refreshing the page or navigating in PoWA fixes the glitch?

For the record, right after the compose file is started and before the collector is reloaded, no version is available for the extensions even though they are marked as available, installed and sampled.

The versions are shown right after the collector reload.

pgiraud · 2024-07-27T17:10:58Z

In database.py we have code that distinguishes different versions of pgss for the toprow. In server.py, we don't.
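One way to keep the two in sync is to derive the toprow from the metrics that are actually kept, so that a removed metric shrinks or drops its group. The sketch below is illustrative only and is not the code of database.py or server.py: `GROUPS`, `build_toprow` and the WAL/JIT metric names are placeholders, and the single leading empty cell is an assumption.

```python
# Hypothetical grouping of metric names; the WAL and JIT names are
# placeholders, since the response above does not list them.
GROUPS = [
    ("Execution", ["calls", "runtime", "avg_runtime"]),
    ("Blocks", ["shared_blks_read", "shared_blks_hit",
                "shared_blks_dirtied", "shared_blks_written"]),
    ("Temp blocks", ["temp_blks_read", "temp_blks_written"]),
    ("I/O", ["io_time"]),
    ("WAL", ["wal_records", "wal_fpi", "wal_bytes"]),
    ("JIT", ["jit_functions", "jit_time"]),
]

def build_toprow(available, leading=1):
    """Build toprow entries covering only the metrics in `available`.

    `leading` is the number of empty cells placed before the groups
    (assumed here to be one, for the database name column).
    """
    toprow = [{} for _ in range(leading)]
    for name, metrics in GROUPS:
        span = sum(1 for m in metrics if m in available)
        if span == 0:
            continue  # every metric of the group was removed: drop the group
        entry = {"name": name}
        if span > 1:
            entry["colspan"] = span  # same shape as the "I/O" entry above
        toprow.append(entry)
    return toprow

# Metrics present in the response above, without the "by_database." prefix.
available = {
    "calls", "runtime", "avg_runtime",
    "shared_blks_read", "shared_blks_hit",
    "shared_blks_dirtied", "shared_blks_written",
    "temp_blks_read", "temp_blks_written", "io_time",
}
print(build_toprow(available))
```

With these ten metrics the "WAL" and "JIT" groups disappear and the remaining colspans add up to the number of metrics.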

rjuju · 2024-07-27T23:06:56Z

Can you confirm that refreshing the page or navigating in PoWA fixes the glitch?

It doesn't. What does work is to manually reload the coordinator and then refresh the page.

I guess that when podman fetches the powa-web image, it's done before starting the collector which means that by the time the collector stops the remote servers are already up and running. I'm also unsure why the depends_on conditions are not applied.

Anyway, good catch! I thought that everything should have been working since the snapshots are happening. I guess that there is also a bug in the collector: I could teach the collector to update the version at the first successful snapshot or something like that, on top of fixing server.py.

rjuju · 2024-07-28T00:35:34Z

I fixed the toprow headers in powa-team/powa-web@c8d480c. The code for that in database.py had also been broken since the planning time metric was added, but it went unnoticed as it didn't mess up the grid too much.

pgiraud · 2024-07-29T05:42:51Z

Is there anything left to be done?

rjuju · 2024-07-29T07:17:09Z

I think it's all good now. I will double check just in case. Thanks for the timely investigation!

rjuju assigned pgiraud Jul 27, 2024
