Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better documentation for the use of proj_trans_get_last_used_operation() #4391

Open
michaeladamkatz opened this issue Jan 30, 2025 · 6 comments

Comments

@michaeladamkatz
Copy link

I'm using proj 9.5.1.

When reading https://proj.org/en/stable/development/migration.html, I don't think it's made clear enough how important it is to understand and use proj_trans_get_last_used_operation().

I have a geotiff file. I open it as a GDALDataset with GDALOpen(), call GDALGetProjectionRef() to get its SRS (which is below), and create a projection for it with GetProjectionFromWKT().

When I use this projection with proj_trans(), my code is orders of magnitude slower than what it was with with pj_transform() in proj 4.

I stepped into the proj_trans() code, and what I see is that huge amounts of time are spent evaluating which of the P->alternativeCoordinateOperations gives the best result. In my case there are 54 alternativeCoordinateOperations (I don't know how the set is determined and populated).

What I see is that proj_trans() does this comparison/evaluation for every coordinate. I may be missing something, but from what I can see, the choice of which operation to use is not dependent on the coordinate, so I don't understand why it's doing that comparison/evaluation more than once. I see it does store the best-found index in the PJ structure, but from what I can see it doesn't use that cached value. So a question is why's it doing that?

It seems like the purpose of the proj_trans_get_last_used_operation() function is to circumvent the evaluation by directly using the previously selected operation (which further adds to the mystery of why it does the evaluation each time, since now you can bypass that evaluation for the remaining points). Thus I have code like this:

PJ *myProjection;

...

void MyProject( double *x, double *y )
{
	static PJ *lastUsedProjection = NULL;
	PJ_COORD coord;
	coord.xy.x = *x;
	coord.xy.y = *y;
	
	if ( lastUsedProjection == NULL )
	{
		coord = proj_trans( myProjection, PJ_INV, coord );
		lastUsedProjection = proj_trans_get_last_used_operation( myProjection );
	}
	else
	{
		coord = proj_trans( lastUsedProjection, PJ_INV, coord );
	}
	
	*x = coord.xy.x;
	*y = coord.xy.y;
}

If that's the idea of how to use proj_trans_get_last_used_operation(), I think an example like that should be shown prominently in the migration guide. I note that the pyproj introduces a Transformer concept, which I believe is a wrapping around this pattern.

It probably doesn't matter for the purpose of this post but here is the SRS/WKT of my file:

PROJCS["NAD83 / Rhode Island (ftUS)",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101004,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",41.0833333333333],PARAMETER["central_meridian",-71.5],PARAMETER["scale_factor",0.99999375],PARAMETER["false_easting",328083.3333],PARAMETER["false_northing",0],UNIT["US survey foot",0.304800609601219,AUTHORITY["EPSG","9003"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","3438"]]

@rouault
Copy link
Member

rouault commented Jan 30, 2025

What I see is that proj_trans() does this comparison/evaluation for every coordinate. I may be missing something, but from what I can see, the choice of which operation to use is not dependent on the coordinate,

That's quite the opposite. We wouldn't just test which operation to use just for the sake of slowing down users :-) For a given source_crs, target_crs tuples there are different alternatives to consider sometimes, each with its own area of use, hence to get the most accurate transformation one has to test if the point to transform fits into the area of use.

If you were transforming between NAD83 (geographic) and WGS84 and there are indeed one transformation per US state as candidates to test, hence the 53.
When trying with your 'PROJCS["NAD83 / Rhode Island (ftUS)", ...' there are only 3 as listed by projinfo -s "NAD83 / Rhode Island (ftUS)" -t WGS84 --spatial-test intersects

@michaeladamkatz
Copy link
Author

Okay, thanks for the explanation.

So the takeaway is that I think it would be very helpful to have a prominent statement/warning on the migration page about how the v 5/6 proj_trans() is often/usually/sometimes orders of magnitude slower than the old pj_transform(), and that, especially if you don't need better than meter level accuracy, you can get around this (a) by using the pattern shown above of using the main projection once and from then on using the projection/operation returned by proj_trans_get_last_used_operation(), or (b) by sticking with the old pj_transform() via ACCEPT_USE_OF_DEPRECATED_PROJ_API_H.

@rouault
Copy link
Member

rouault commented Jan 30, 2025

and that, especially if you don't need better than meter level accuracy, you can get around this (a) by using the pattern shown above of using the main projection once

I wouldn't recommend that unless you are certain all your points fit in the area of use of that transformation. Otherwise that could fail.

or (b) by sticking with the old pj_transform() via ACCEPT_USE_OF_DEPRECATED_PROJ_API_H.

pj_transform has been removed for long (PROJ 7 I believe)

A better option for people seeking maximum performance is that people use projinfo and fetch a pipeline they find reasonable for their use case and provide it directly to proj_create().

@michaeladamkatz
Copy link
Author

Okay, my point remains that many might be surprised by the inefficiency of proj_trans() and could use some prominently placed information about that, and about the recommended ways of dealing with it, with examples. In fact I could use an example right now of how to fetch a reasonable pipeline for proj_create().

@jjimenezshaw
Copy link
Contributor

@michaeladamkatz it is not that this method is slower. The whole library was refactored some years ago to produce more accurate results, and honour transformations in epsg. That was a big change.
That implied that some things are slower? Yes. This happened years ago. It is not that a single function, with the same name is now slower. Everything is different.
However we tried to keep the maximum backward compatibility as possible.

@michaeladamkatz
Copy link
Author

@jjimenezshaw, I appreciate that. I don't claim to know what the most common use cases are for the proj library, but my thought has been that proj_trans() is likely at the heart of what many/most people do with it. If I were just coming to proj 9.5.1 now, without having used old versions in the past, I would just conclude that projection is a slow, complicated process, and my hope of converting 64k of points in real time in a map display program was just not realistic. All I'm trying to say with this post is that that seems like a real gotcha for users like me, when there are ways to to transform those 64k points orders of magnitude faster, especially when the slow-down is about optimizing for a level of accuracy I don't care about.

Maybe I missed it, but it would be very helpful to have a prominent section of the documentation called "Efficiency Concerns" covering the main ways people may not realize they can get much faster code with a minimal sacrifice. If that covers topics beyond proj_trans(), all the better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants