Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization idea for faster lookups #25

Open
rhwilburn opened this issue Jan 14, 2016 · 3 comments
Open

Optimization idea for faster lookups #25

rhwilburn opened this issue Jan 14, 2016 · 3 comments

Comments

@rhwilburn
Copy link

I had an idea today when trying to work in-memory with GTFS that it can make sense to do look ups, as I have seen in the code. The look ups however are likely (and I am yet to test an approach here) inefficient as they use LINQ First() method. I think putting things in a dictionary might make alot of sense here, at least for finding items by Id. Currently my app takes 17 minutes to run, so I am seeing if I can bring that down a bit. I will let you know if I find any key information here.

@rhwilburn
Copy link
Author

I have confirmed doing this has optimized my program down from 17 minutes to 2.5 minutes.

My work around was to do the following:

var stopTimesByTrips = feed.StopTimes.GroupBy(stopTime => stopTime.TripId).ToDictionary(stopTimeGrouping => stopTimeGrouping.Key, stopTimeGrouping => stopTimeGrouping.OrderBy(shape => shape.StopSequence).ToList());
var shapesDictionary = feed.Shapes.GroupBy(shapes => shapes.Id).ToDictionary(shapeGrouping => shapeGrouping.Key, shapeGrouping => shapeGrouping.OrderBy(shape => shape.Sequence).ToList());
var tripsDictionary = feed.Trips.ToDictionary(trip => trip.Id);
var routesDictionary = feed.Routes.ToDictionary(route => route.Id);

Maybe this sort of idea could be provide through extension methods?

@xivk
Copy link
Contributor

xivk commented Jan 14, 2016

I meant the in-memory db just for reference not so much for actual applications but I also built some code using it by now. Maybe it's better to use a database for your app or find some .NET in-memory db implementation we can use? I'm a bit reluctant to implement some of these features because these things can be fixed by using a database.

@rhwilburn
Copy link
Author

I can see the arguments towards putting it a formal database (especially with regards to allowing DBs to cater for efficient indexing; I mean who wants to implement something that competes with the 100s of databases out there unless explicitly aiming to do so) however I can equally see the arguments for efficient code. I think my bias in suggesting this is that lists are being used where dictionaries should be because you are continuously structuring the code in a way that it is doing look ups.

With that said, you could argue that you might need dictionaries for many things other than just ID. You could also argue that people might not be doing lookups very often either. My were lookups generating a 2 gig GraphML database import file from 30mb zip of GTFS data (about a 1.5 million person city). I really want to convert in my use case, not to store in an intermediary persistent format (ie the database) for the one off conversion.

I have a work around so am not bothered as such if you are sure this is not in scope of the project. Feel free to close etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants