Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grid changes when resampling after clip #12

Open
hannahker opened this issue Oct 31, 2024 · 2 comments
Open

Grid changes when resampling after clip #12

hannahker opened this issue Oct 31, 2024 · 2 comments

Comments

@hannahker
Copy link
Collaborator

After a bunch of testing, I've noticed that some pcodes have slightly different raster stats outputs based on whether resampling to a 0.05 degree grid is done before or after clipping to a given bounding box.

So for example:

da_1 = da_seas5.sel(x=slice(minx, maxx), y=slice(maxy, miny)).persist()
da_1 = upsample_raster(da_1)

print(da_1.rio.transform())
print(da_1.rio.resolution())

# Returns
# | 0.05, 0.00, 33.00|
# | 0.00,-0.05, 15.00|
# | 0.00, 0.00, 1.00|
# (0.050000000000000024, -0.050000000000000024)

vs

da_2 = upsample_raster(da_seas5)
da_2 = da_2.sel(x=slice(minx, maxx), y=slice(maxy, miny)).persist()

print(da_2.rio.transform())
print(da_2.rio.resolution())

# Returns
# | 0.05, 0.00, 32.97|
# | 0.00,-0.05, 14.83|
# | 0.00, 0.00, 1.00|
# (0.05002757249118728, -0.05001598322000091)

We can see the impact of this when we take a look at the rasterized admin boundaries from the two separate grids:

Screen.Recording.2024-10-31.at.3.50.05.PM.mov
@hannahker
Copy link
Collaborator Author

Claude says:

This is a common issue in raster processing that occurs due to how resampling and clipping operations handle pixel alignment and boundaries. Let me explain what's happening and suggest solutions.

import xarray as xr
import rioxarray

def align_and_resample(
    da: xr.DataArray,
    target_resolution: float,
    bounds: tuple,
    align_pixels: bool = True
) -> xr.DataArray:
    """
    Properly aligns and resamples a DataArray while maintaining pixel alignment.
    
    Parameters
    ----------
    da : xarray.DataArray
        Input DataArray with spatial coordinates
    target_resolution : float
        Desired resolution in units of the input data's CRS
    bounds : tuple
        (minx, miny, maxx, maxy) in the input data's CRS
    align_pixels : bool, default True
        If True, ensures pixels are aligned to the target resolution grid
    
    Returns
    -------
    xarray.DataArray
        Resampled and aligned DataArray
    """
    # Extract bounds
    minx, miny, maxx, maxy = bounds
    
    if align_pixels:
        # Align bounds to target resolution grid
        minx = round(minx / target_resolution) * target_resolution
        maxx = round(maxx / target_resolution) * target_resolution
        miny = round(miny / target_resolution) * target_resolution
        maxy = round(maxy / target_resolution) * target_resolution
    
    # Calculate number of pixels
    nx = int(round((maxx - minx) / target_resolution))
    ny = int(round((maxy - miny) / target_resolution))
    
    # Create target coordinates
    new_x = np.linspace(minx + target_resolution/2, 
                       maxx - target_resolution/2, 
                       nx)
    new_y = np.linspace(maxy - target_resolution/2,
                       miny + target_resolution/2,
                       ny)
    
    # Create template DataArray with desired grid
    template = xr.DataArray(
        np.zeros((ny, nx)),
        coords={
            'y': new_y,
            'x': new_x
        },
        dims=['y', 'x']
    )
    
    # Add CRS if present in original
    if hasattr(da, 'rio'):
        template.rio.write_crs(da.rio.crs, inplace=True)
    
    # Reproject input data to match template
    result = da.rio.reproject_match(template)
    
    return result

The difference you're seeing occurs because:

  1. When you clip first (da_1), the resampling operation works with the exact boundary coordinates you specified, creating a regular grid starting from those points.

  2. When you resample first (da_2), the resampling operation creates a new grid based on the original data's extent, and then the clipping operation has to snap to the nearest pixels in this new grid, which can lead to slightly different alignments.

To solve this, I've provided a function that:

  1. Aligns bounds to the target resolution grid
  2. Creates a template grid with exact pixel centers
  3. Uses reproject_match to ensure consistent results

You can use it like this:

# Define your target resolution and bounds
target_resolution = 0.05
bounds = (minx, miny, maxx, maxy)

# Process your data
result = align_and_resample(da_seas5, target_resolution, bounds)

This approach will give you consistent results regardless of the order of operations. The key benefits are:

  1. Pixel centers are properly aligned to the target resolution grid
  2. Boundaries are adjusted to ensure whole pixels
  3. The output resolution will be exactly what you specified

@hannahker
Copy link
Collaborator Author

Looks like @zackarno found the solution here! #13 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant