Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle matrix via reference for grid input #3829

Merged
merged 13 commits into from
Aug 4, 2020
Merged

Conversation

PaulWessel
Copy link
Member

The passing of a matrix masquerading as a grid via GMT_IS_DUPLICATE [the default mode] now seems to work. However passing such grids by reference is also allowed, but I suspect there are cases where it cannot be done (since we would need to do something special to the matrix). Testing is done via testapi_matrix_360_ref.c. Seems to work (please verify if this works from PyGMT, @seisman) but raises an issue related to memory freeing. Since the matrix data is only pointed to by the grid, I set the grid alloc_mode to external so both of them do not try to free the memory. However, that also means that the x and y arrays that were created when creating the grid is left at the end. We could just ignore this rare memory leak or do something about it. One way is to add info to the struct GMT_GRID_HEADER_HIDDEN so that we can free those arrays even those the matrix is left untouched.

I don't know if @joa-quim is wrapping the gmt_hidden.h file in Julia and so adding a new variable will cause trouble.

Testing is done via testapi_matrix_360_ref.c  Seems to work but raises an issue related to memory freeing.  Since the matrix data is only pointed to by the grid, I set the grid alloc_mode to external.  However, that also means that the x and y arrays that are created when creating the grid is left unfreed at the end.  We could just ignore this rare memory leak or do something about it.  One way is to add info to the struct HIDDEN so that we free those arrays even those the matrix is left untouched.
@PaulWessel PaulWessel added bug Something isn't working backport 6.1 Backport this PR to 6.1 branch labels Aug 3, 2020
@PaulWessel PaulWessel requested review from joa-quim and seisman August 3, 2020 20:03
@PaulWessel PaulWessel self-assigned this Aug 3, 2020
@seisman
Copy link
Member

seisman commented Aug 3, 2020

I can confirm that GenericMappingTools/pygmt#515 now works for GMT_IN, GMT_IN|GMT_IS_REFERENCE, and GMT_IN|GMT_IS_DUPLICATE. But this PR breaks two PyGMT tests. The two PyGMT tests call grdcut to extract a subregion from a matrix.

BTW, there is another grdimage bug from PyGMT (GenericMappingTools/pygmt#390). I thought it's caused by the same reason as GenericMappingTools/pygmt#515. But, this PR works for GenericMappingTools/pygmt#515, not for GenericMappingTools/pygmt#390.

@PaulWessel
Copy link
Member Author

Could you tell me the grdcut arguments? I assume input is a global grid and the subset is not global? If so then I am not sure why it would break since it would enter the S_obj->region if-test which was there all along.
Less clear why your Py issue 390 would fail after this PR. it seems exactly like the case herein, but there must be a difference.

@seisman
Copy link
Member

seisman commented Aug 3, 2020

Could you tell me the grdcut arguments? I assume input is a global grid and the subset is not global? If so then I am not sure why it would break since it would enter the S_obj->region if-test which was there all along.

Here is the core PyGMT code. It reads the 01d pixel-registered earth relief global grid into an xarray, and extract the subregion of the grid.

grid = load_earth_relief(registration="pixel")
grdcut(grid, outgrid=tmpfile.name, region="0/180/0/90")

@seisman
Copy link
Member

seisman commented Aug 3, 2020

Less clear why your Py issue 390 would fail after this PR. it seems exactly like the case herein, but there must be a difference.

PyGMT issue 390 fails since GMT 6.0, and is worse (crashes or black images) in GMT 6.1, and this PR doesn't fix it.

@PaulWessel
Copy link
Member Author

I see the report for 390 about zmax < zmin. This is puzzling since when we just request header info in grdimage, for a matrix, it scans the entire matrix for min/max to be sure it is set.
I don't think we can keep writing C clones to test every possible case in PyGMT. We need to learn how to run these python scripts and step into the C code in Xcode or whatever debugger.

@PaulWessel
Copy link
Member Author

And the grdcut case uses reference or duplicate?

@seisman
Copy link
Member

seisman commented Aug 3, 2020

I see the report for 390 about zmax < zmin. This is puzzling since when we just request header info in grdimage, for a matrix, it scans the entire matrix for min/max to be sure it is set.

It usually means zmin=zmax=0.0.

I don't think we can keep writing C clones to test every possible case in PyGMT. We need to learn how to run these python scripts and step into the C code in Xcode or whatever debugger.

I already know how to debug PyGMT and C code in Xcode. It's just like what you do with Matlab: run a python console, attach the process id, and run PyGMT codes in the Python console and xcode will stop at the breakpoint. But I don't know where to debug, since the C API is too complicated to understand.

@seisman
Copy link
Member

seisman commented Aug 3, 2020

And the grdcut case uses reference or duplicate?

Currently, PyGMT always uses GMT_IN|GMT_IS_REFERENCE.

Both GMT_IN and GMT_IN|GMT_IS_DUPLICATE work, but sometimes they also crash.

@PaulWessel
Copy link
Member Author

OK, let me give it a try then. I can check out the pyGMT master on macOS. Can you give me a quick guide for how to install and select my particular GMT build?

@seisman
Copy link
Member

seisman commented Aug 3, 2020

First of all, you need to install PyGMT dependencies and PyGMT itself. PyGMT will tries to find the system GMT library from the LD_LIBRARY_PATH.

For debugging purpose, you can just set the environmental variable GMT_LIBRARY_PATH, like

export GMT_LIBRARY_PATH=~/Gits/gmt/gmt/build/xcode/src/Debug

then PyGMT will find the GMT library in this path first.

In the Python console, you can run following commands to verify your GMT library:

>>> import pygmt
>>> pygmt.show_versions()
PyGMT information:
  version: v0.1.2+30.gbe75223e
System information:
  python: 3.7.6 (default, Jan  8 2020, 13:42:34)  [Clang 4.0.1 (tags/RELEASE_401/final)]
  executable: /Users/seisman/.anaconda/bin/python
  machine: Darwin-19.6.0-x86_64-i386-64bit
Dependency information:
  numpy: 1.18.1
  pandas: 1.0.1
  xarray: 0.15.1
  netCDF4: 1.5.3
  packaging: 20.1
  ghostscript: 9.50
  gmt: 6.2.0_27ceb6e_2020.08.03
GMT library information:
  binary dir: /Users/seisman/.anaconda/bin
  cores: 8
  grid layout: rows
  library path: /Users/seisman/local/GMT/lib/libgmt.dylib
  padding: 2
  plugin dir: /Users/seisman/local/GMT/lib/gmt/plugins
  share dir: /Users/seisman/local/GMT/share
  version: 6.2.0

@PaulWessel
Copy link
Member Author

I am looking for even more basic info. I dont see anything on the pyGMT site of installing anything. I think I need to install conda which I have not used. Is there an INSTALL page or similar I can just follow? Sorry to be such a newbie.

@seisman
Copy link
Member

seisman commented Aug 3, 2020

PyGMT has an installation page. I think you can follow the instruction, but when installing dependencies:

conda create --name pygmt python=3.8 pip numpy pandas xarray netcdf4 packaging gmt

You can leave the "gmt", because you don't need the gmt package from conda.

@PaulWessel
Copy link
Member Author

OK, thanks. Do I first install the full Anaconda or is miniconda OK?

@seisman
Copy link
Member

seisman commented Aug 3, 2020

OK, thanks. Do I first install the full Anaconda or is miniconda OK?

miniconda is OK.

@PaulWessel
Copy link
Member Author

OK, microconda installed, dependencies installed. How does python find pygmt on import? Must be some path setting since I get errors if i just run python and enter the import command.

@seisman
Copy link
Member

seisman commented Aug 3, 2020

OK, microconda installed, dependencies installed. How does python find pygmt on import? Must be some path setting since I get errors if i just run python and enter the import command.

Run

cd pygmt
make install

@PaulWessel
Copy link
Member Author

(base) pwessel@macnut:~/GMTdev/pygmt-> make install
pip install --no-deps -e .
Obtaining file:///Users/pwessel/GMTdev/pygmt
Installing collected packages: pygmt
  Running setup.py develop for pygmt
Successfully installed pygmt
(base) pwessel@macnut:~/GMTdev/pygmt-> python
Python 3.8.3 (default, May 19 2020, 13:54:14) 
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pygmt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pwessel/GMTdev/pygmt/pygmt/__init__.py", line 15, in <module>
    from .session_management import begin as _begin, end as _end
  File "/Users/pwessel/GMTdev/pygmt/pygmt/session_management.py", line 4, in <module>
    from .clib import Session
  File "/Users/pwessel/GMTdev/pygmt/pygmt/clib/__init__.py", line 8, in <module>
    from .session import Session
  File "/Users/pwessel/GMTdev/pygmt/pygmt/clib/session.py", line 10, in <module>
    from packaging.version import Version
ModuleNotFoundError: No module named 'packaging'

@seisman
Copy link
Member

seisman commented Aug 3, 2020

(base) pwessel@macnut:~/GMTdev/pygmt-> make install

It seems you're using the base (the default) conda environment. If you follow the PyGMT install instruction, you should have created a new environment called "pygmt".

Just run conda activate pygmt to switch to the newly created enviroment and run make install again.

@PaulWessel
Copy link
Member Author

Sorry, got side-tracked. OK, verified it found the xcode-built lib. Trying to run a test by copy paste from test_grdcut.py. COUld import os, numby but not pytest which is not found.

@seisman
Copy link
Member

seisman commented Aug 3, 2020

Those tests are designed to be run by pytest. You can use this one instead:

import pygmt

from pygmt.datasets import load_earth_relief

grid = load_earth_relief(registration="pixel")
pygmt.grdcut(grid, outgrid="out.grd", region="0/180/0/90")

@PaulWessel
Copy link
Member Author

Thanks for your help, up and running in Xcode now.

@joa-quim
Copy link
Member

joa-quim commented Aug 3, 2020

grid = load_earth_relief(registration="pixel")
pygmt.grdcut(grid, outgrid="out.grd", region="0/180/0/90")

Does this mean one have to load the full grid first and only than extract a sub-region?

Tried to reproduce this on Win but it looks my pygmt doesn't work anymore

>>> pygmt.show_versions()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'pygmt' has no attribute 'show_versions

Note, I refuse to install condas to make py things work (and it worked already)

@seisman
Copy link
Member

seisman commented Aug 3, 2020

grid = load_earth_relief(registration="pixel")
pygmt.grdcut(grid, outgrid="out.grd", region="0/180/0/90")

Does this mean one have to load the full grid first and only than extract a sub-region?

For this example, it's yes. But load_earth_relief can also accept the region (i.e., -R) argument, which calls gmt grdcut @earth_relief_xxy -Rw/e/s/n and returns a subregion.

Tried to reproduce this on Win but it looks my pygmt doesn't work anymore

>>> pygmt.show_versions()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'pygmt' has no attribute 'show_versions

Perhaps you're using an old pygmt? pygmt.show_versions() was added in PyGMT 0.1.2, which was released a month ago.

Note, I refuse to install condas to make py things work (and it worked already)

Yes, conda is not necessary for PyGMT.

@PaulWessel
Copy link
Member Author

I've made some additional revisions to this part of the code, especially when we grab the entire matrix. I also applied what we have learned for the matrix duplicate case to the grid duplicate case which I dont think we are using but it was clearly wrong, just like the matrix initially was clearly wrong. Let me know if there is any changes from this on the python side. The tests pass on command line.

@seisman
Copy link
Member

seisman commented Aug 4, 2020

I think your changes have fixed most issues with matrix.

Here is one which doesn't make sense to me:

import pygmt
from pygmt.datasets import load_earth_relief

grid = load_earth_relief()
fig = pygmt.Figure()
fig.grdimage(grid, region='g', projection='Q0/0/6i', cmap='geo', frame='a90f30g30')
fig.grdimage(grid, region='g', projection='Q180/0/6i', cmap='geo', X='7i', frame='a90f30g30')
fig.savefig("earth.png")

The figures are the same, but the longitudes of the right figure are wrong:

image

@PaulWessel
Copy link
Member Author

Will have a look. I just added another update but unlikely to affect the case above.

@PaulWessel
Copy link
Member Author

And this is with REFERENCE in session,py?

@seisman
Copy link
Member

seisman commented Aug 4, 2020

REFERENCE and DUPLICATE give the same wrong results

@PaulWessel
Copy link
Member Author

Soemthing else I need to set?

fig.grdimage(grid, region='g', projection='Q0/0/6i', cmap='geo', frame='a90f30g30')
grdimage [ERROR]: File /Users/pwessel/opt/miniconda3/envs/pygmt/share/geo.cpt was not found

@seisman
Copy link
Member

seisman commented Aug 4, 2020

Yes, when debugging, it seems GMT incorrectly determines the GMT_SHAREDIR, so you need to change GMT_SHAREDIR to the correct GMT/share directory.

@PaulWessel
Copy link
Member Author

The 180 plot does not work because in this case there is no need for a projection but I cannot rotate the grid without projecting. So need to find a way to set need_to_project to true when the grid is by reference.

@PaulWessel
Copy link
Member Author

Your example works for now now. Let me know if this branch can be merged and I will remove the WIP.

@seisman
Copy link
Member

seisman commented Aug 4, 2020

Find one more crash with GMT_IN|GMT_IS_REFERENCE, but it works for GMT_IN or GMT_IN|GMT_IS_DUPLICATE.

The script reads the file @tut_relief.nc into xarray and pass it to grdimage:

import pygmt
import xarray as xr

pygmt.show_versions()

fig = pygmt.Figure()

fname = pygmt.which("@tut_relief.nc", download='a')
dem = xr.open_dataarray(fname)

fig.grdimage(dem, frame='af', projection='M6i', cmap='geo', shading=True, V='d')

fig.savefig("shading.png")

@PaulWessel
Copy link
Member Author

Unable to reproduce any crash - plot looks good. Intermittent?

@seisman
Copy link
Member

seisman commented Aug 4, 2020

It always gives me a crash.

Anyway, I think this PR already fixes the PyGMT issues 390 and 515, and this PR is already very large.

I think we can merge this PR first. Later I can provide a detailed bug report for the crash above.

@PaulWessel
Copy link
Member Author

Yes, I was going to suggest something like that as well. Better to focus on a specific - this PR solves many things but possibly not all.

@PaulWessel PaulWessel changed the title WIP Handle matrix via reference for grid input Handle matrix via reference for grid input Aug 4, 2020
@PaulWessel PaulWessel merged commit 4c18338 into master Aug 4, 2020
@PaulWessel PaulWessel deleted the ref-matrix-grid branch August 4, 2020 19:32
github-actions bot pushed a commit that referenced this pull request Aug 4, 2020
* Handle matrix vai reference for grid input

Testing is done via testapi_matrix_360_ref.c  Seems to work but raises an issue related to memory freeing.  Since the matrix data is only pointed to by the grid, I set the grid alloc_mode to external.  However, that also means that the x and y arrays that are created when creating the grid is left unfreed at the end.  We could just ignore this rare memory leak or do something about it.  One way is to add info to the struct HIDDEN so that we free those arrays even those the matrix is left untouched.

* Add PS orig

* Update gmt_api.c

* Update gmt_grdio.c

* Fix output ij

* Handle a switch from reference to duplicate

* make sure we set min/max when needed and not reset it

* use lessons learned from MATRIX/DUPLICATE to GRID/DUPLICATE

Lots of errors there as well and probalby never accessed.

* Use duplcate for 360 adn ref for 360_ref

Also fix the region assignment

* Apply same logic for finding min/max as for duplicate

* handle non-projection in grdimage when a referenced global grid
PaulWessel added a commit that referenced this pull request Aug 4, 2020
* Handle matrix vai reference for grid input

Testing is done via testapi_matrix_360_ref.c  Seems to work but raises an issue related to memory freeing.  Since the matrix data is only pointed to by the grid, I set the grid alloc_mode to external.  However, that also means that the x and y arrays that are created when creating the grid is left unfreed at the end.  We could just ignore this rare memory leak or do something about it.  One way is to add info to the struct HIDDEN so that we free those arrays even those the matrix is left untouched.

* Add PS orig

* Update gmt_api.c

* Update gmt_grdio.c

* Fix output ij

* Handle a switch from reference to duplicate

* make sure we set min/max when needed and not reset it

* use lessons learned from MATRIX/DUPLICATE to GRID/DUPLICATE

Lots of errors there as well and probalby never accessed.

* Use duplcate for 360 adn ref for 360_ref

Also fix the region assignment

* Apply same logic for finding min/max as for duplicate

* handle non-projection in grdimage when a referenced global grid

Co-authored-by: Paul Wessel <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 6.1 Backport this PR to 6.1 branch bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants