Glitch in database detail grid with the powa-web-git image #14

Open
rjuju opened this issue Jul 27, 2024 · 11 comments

@rjuju
Member

rjuju commented Jul 27, 2024

@pgiraud I don't know if you can reproduce this problem.

On my side, using the powa_remote_mode.yml compose file (I just pushed the missing POWA_REMOTE_PORT to this one and another compose file, plus the switch to the v3 format), I get a normal-looking instance page, e.g.:

db_details_ok

But if I shut down the compose file and restart it, for some reason the db detail top row gets messed up and the last few columns are pushed out, as if there was a wrong colspan:

db_details_ko

Just restarting the powa-git-web container is not enough; I need a full "podman-compose down && podman-compose up" for that. Once it happens, it stays broken no matter what I do; the only way to fix it is to remove the image and fetch it again.

Can you reproduce the problem? If yes, can you fix it?

@pgiraud
Member

pgiraud commented Jul 27, 2024

Unfortunately, I'm not able to reproduce the problem.

What I did:

  • pruned everything I could (volumes, images, containers, system),
  • ran podman-compose -f compose/powa_remote_mode.yml up (not detached),
  • opened the browser for the "primary" server (I had to go to Configuration to reload the collector because it was marked as stopped),
  • waited for a few seconds,
  • pressed Ctrl-C to stop the compose,
  • ran the same command again.

Everything displays as expected.

How did you get the "tpc" and "obvious" databases? I don't have them when using the powa_remote_mode.yml compose file. I thought that they were only available with the demo workload image.

@rjuju
Member Author

rjuju commented Jul 27, 2024

I'm not sure exactly what Ctrl-C does. Can you try:

podman-compose -f compose/powa_remote_mode.yml up -d
# wait until it's up and has taken a couple of snapshots
podman-compose -f compose/powa_remote_mode.yml down -t0

podman-compose -f compose/powa_remote_mode.yml up -d

which is what I'm usually using, if that matters.

> How did you get the "tpc" and "obvious" databases? I don't have them when using the powa_remote_mode.yml compose file. I thought that it was only available with the demo workload image.

I'm just using a modified compose file. It's simply powa_remote_mode.yml, to which I added the 3 containers used in the dev demo compose file. Apart from the extra containers, there are no modifications.

@pgiraud
Member

pgiraud commented Jul 27, 2024

Could you share the modified compose file, just in case it helps reproduce the problem?

@rjuju
Member Author

rjuju commented Jul 27, 2024

$ diff ../powa_demo.yml compose/powa_remote_mode.yml
70,112d69
< 
<   pgbench-std-primary:
<     image: powateam/powa-pgbench
<     container_name: powa-dev-pgbench-std-primary
<     restart: on-failure
<     environment:
<       PGHOST: 'remote-primary'
<       PGUSER: 'postgres'
<       PGPORT: 5433
<       BENCH_SCALE_FACTOR: 10
<       BENCH_TIME: 60
<       BENCH_FLAG: '-c1 -j1 -n -R 10'
<     depends_on:
<       remote-primary:
<         condition: service_healthy
< 
<   pgbench-std-standby:
<     image: powateam/powa-pgbench
<     container_name: powa-dev-pgbench-std-standby
<     restart: on-failure
<     environment:
<       PGHOST: 'remote-standby'
<       PGUSER: 'postgres'
<       PGPORT: 5434
<       BENCH_SKIP_INIT: 'true'
<       BENCH_SCALE_FACTOR: 10
<       BENCH_TIME: 120
<       BENCH_FLAG: '-c2 -j2 -S -n -R 10'
<     depends_on:
<       remote-standby:
<         condition: service_healthy
< 
<   pgdemoworload-std-primary:
<     image: powateam/powa-demoworkload
<     container_name: powa-dev-demoworkload-std-primary
<     restart: on-failure
<     environment:
<       PGHOST: 'remote-primary'
<       PGUSER: 'postgres'
<       PGPORT: 5433
<     depends_on:
<       remote-primary:
<         condition: service_healthy

@pgiraud
Member

pgiraud commented Jul 27, 2024

OK, I could reproduce it without the extra containers. I have an idea (possibly wrong) about what's happening.

@pgiraud
Member

pgiraud commented Jul 27, 2024

In my case, with the default compose file, the collector for the primary server is stopped after the first compose up command (starting from a clean podman environment).

At this point, the overview page for the primary server (http://localhost:8888/server/1/overview/) doesn't show anything (empty components) and the grid for the databases is kind of broken: there are more column groups than columns.

Screenshot from 2024-07-27 18-25-35

In the web console, the JSON response for this page looks like:

{
    "server": "1",
    "title": "Details for all databases",
    "metrics": [
        "by_database.calls",
        "by_database.runtime",
        "by_database.avg_runtime",
        "by_database.shared_blks_read",
        "by_database.shared_blks_hit",
        "by_database.shared_blks_dirtied",
        "by_database.shared_blks_written",
        "by_database.temp_blks_read",
        "by_database.temp_blks_written",
        "by_database.io_time"
    ],
    "columns": [
        {
            "name": "datname",
            "label": "Database",
            "url_attr": "url"
        }
    ],
    "toprow": [
        {},
        {},
        {
            "name": "Execution",
            "colspan": 3
        },
        {
            "name": "Blocks",
            "colspan": 4
        },
        {
            "name": "Temp blocks",
            "colspan": 2
        },
        {
            "name": "I/O"
        },
        {
            "name": "WAL",
            "colspan": 3
        },
        {
            "name": "JIT",
            "colspan": 2
        }
    ],
    "type": "grid"
}

The problem seems to be that the toprow column groups don't match the columns (metrics). We actually don't take the metrics removal into account when we compute the toprow.
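The mismatch can be checked from the response above. This is only a minimal sketch: it assumes that every entry in "columns" and every entry in "metrics" renders as exactly one grid column, and that a toprow cell without "colspan" spans one column. The helper names toprow_width and grid_width are made up for illustration and are not part of powa-web.

```python
def toprow_width(toprow):
    """Number of grid columns covered by the header cells (missing colspan counts as 1)."""
    return sum(cell.get("colspan", 1) for cell in toprow)


def grid_width(config):
    """Number of data columns. Assumption: one column per 'columns' and per 'metrics' entry."""
    return len(config["columns"]) + len(config["metrics"])


# Trimmed copy of the JSON response shown above.
config = {
    "metrics": [
        "by_database.calls",
        "by_database.runtime",
        "by_database.avg_runtime",
        "by_database.shared_blks_read",
        "by_database.shared_blks_hit",
        "by_database.shared_blks_dirtied",
        "by_database.shared_blks_written",
        "by_database.temp_blks_read",
        "by_database.temp_blks_written",
        "by_database.io_time",
    ],
    "columns": [{"name": "datname", "label": "Database", "url_attr": "url"}],
    "toprow": [
        {},
        {},
        {"name": "Execution", "colspan": 3},
        {"name": "Blocks", "colspan": 4},
        {"name": "Temp blocks", "colspan": 2},
        {"name": "I/O"},
        {"name": "WAL", "colspan": 3},
        {"name": "JIT", "colspan": 2},
    ],
}

print(toprow_width(config["toprow"]), grid_width(config))  # prints: 17 11
```

Under those assumptions the header row spans 17 columns while only 11 are rendered, which would match the "wrong colspan" look in the screenshots.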

On the UI side, there might be a problem too. If the collector is manually reloaded (using the Actions menu on the top bar), the grid layout stays broken until the user reloads the whole page (F5 or the reload button in the browser) or navigates between pages (for example, goes back to the list of servers and then chooses the primary server again).

Can you confirm that refreshing the page or navigating in PoWA fixes the glitch?

For the record, when the compose file has just been started, before the collector is reloaded, no version is available for the extensions even though they are marked as available, installed and sampled.

Screenshot from 2024-07-27 18-55-04

The versions are shown right after the collector reload.

Screenshot from 2024-07-27 18-57-56

@pgiraud
Member

pgiraud commented Jul 27, 2024

In database.py we have code that distinguishes different versions of pgss for the toprow. In server.py, we don't.
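One way to keep the header in sync is to derive the toprow from the metrics that are actually kept, instead of hard-coding the groups. The sketch below is hypothetical and is not the code in database.py or server.py: GROUPS and build_toprow are invented names, the WAL and JIT metric names are assumed (they do not appear in the response above), and it assumes the metrics are listed in the same order as the groups.

```python
# Hypothetical grouping table. The WAL and JIT metric names are assumptions.
GROUPS = [
    ("Execution", ["by_database.calls", "by_database.runtime",
                   "by_database.avg_runtime"]),
    ("Blocks", ["by_database.shared_blks_read", "by_database.shared_blks_hit",
                "by_database.shared_blks_dirtied",
                "by_database.shared_blks_written"]),
    ("Temp blocks", ["by_database.temp_blks_read",
                     "by_database.temp_blks_written"]),
    ("I/O", ["by_database.io_time"]),
    ("WAL", ["by_database.wal_records", "by_database.wal_fpi",
             "by_database.wal_bytes"]),
    ("JIT", ["by_database.jit_functions", "by_database.jit_time"]),
]


def build_toprow(metrics, leading=1):
    """Build header cells only for groups that have at least one metric present.

    `leading` is the number of empty cells placed before the groups
    (assumed here: one per non-metric column such as "datname").
    """
    present = set(metrics)
    toprow = [{} for _ in range(leading)]
    for name, members in GROUPS:
        span = sum(1 for metric in members if metric in present)
        if span == 0:
            continue  # whole group removed: emit no header cell
        cell = {"name": name}
        if span > 1:
            cell["colspan"] = span
        toprow.append(cell)
    return toprow


# The ten metrics from the response above (no WAL or JIT metrics).
metrics = [m for name, members in GROUPS[:4] for m in members]
print(build_toprow(metrics))
```

With the ten metrics from the response, the WAL and JIT cells are dropped and the header spans 11 columns, the same as the grid under the one-column-per-entry assumption.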

@rjuju
Member Author

rjuju commented Jul 27, 2024

> Can you confirm that refreshing the page or navigating in PoWA fixes the glitch?

It doesn't. What works is indeed to manually reload the coordinator and then refresh the page.

I guess that when podman fetches the powa-web image, it's done before starting the collector which means that by the time the collector stops the remote servers are already up and running. I'm also unsure why the depends_on conditions are not applied.

Anyway, good catch! I thought that everything should have been working since the snapshots are happening. I guess that there is also a bug in the collector: I could teach the collector to update the version at the first successful snapshot or something like that, on top of fixing server.py.

@rjuju
Member Author

rjuju commented Jul 28, 2024

I fixed the toprow headers in powa-team/powa-web@c8d480c. The code for that in database.py had also been broken since the planning time metric was added, but it went unnoticed as it didn't mess up the grid too much.

@pgiraud
Member

pgiraud commented Jul 29, 2024

Is there anything left to be done?

@rjuju
Member Author

rjuju commented Jul 29, 2024

I think it's all good now. I will double check just in case. Thanks for the timely investigation!

2 participants