[multiresolutionimageinterface] Speed up patch loading #251
There are a couple of things you can try:
1. Use multiprocessing and read patches from several images at once.
2. Sample all patches once and write them to disk in a format that your
DL library of choice can read quickly (e.g. TFRecords for TensorFlow).
3. Try to avoid reading across tile boundaries; the underlying TIFF files
are tiled. If you request a region that is the same size as the tile size
but starts at the center of a tile, four tiles have to be read to construct
the requested region. Avoiding this is not always possible and depends on
your use case, of course.
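To make point 3 concrete, here is a minimal sketch in plain Python (no ASAP calls) that counts how many tiles a requested region overlaps and snaps a patch origin onto the tile grid. The helper names `tiles_touched` and `snap_to_tile_grid` and the tile size of 512 are assumptions for illustration only; check the actual tile size of your files, and note that the coordinates and the tile size must be expressed in pixels of the same level.

```python
def tiles_touched(x, y, width, height, tile_size):
    """Count the tiles that the region (x, y, width, height) overlaps."""
    first_col = x // tile_size
    last_col = (x + width - 1) // tile_size
    first_row = y // tile_size
    last_row = (y + height - 1) // tile_size
    return (last_col - first_col + 1) * (last_row - first_row + 1)


def snap_to_tile_grid(x, y, tile_size):
    """Move a patch origin to the top-left corner of the tile containing it."""
    return (x // tile_size) * tile_size, (y // tile_size) * tile_size


TILE = 512  # assumed tile size, not read from any file

# A tile-sized patch that starts at the center of a tile straddles four tiles.
centered = tiles_touched(256, 256, TILE, TILE, TILE)

# The same patch snapped onto the grid only needs a single tile.
snapped_x, snapped_y = snap_to_tile_grid(256, 256, TILE)
aligned = tiles_touched(snapped_x, snapped_y, TILE, TILE, TILE)

print(centered, aligned)
```

Snapping shifts the sampled location by up to one tile, so it only fits use cases where the exact patch position is not critical.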
On Thu, 24 Nov 2022 at 14:22, pmod ***@***.***> wrote:
Hi,
I am attempting to write an as-fast-as-possible (TensorFlow/Python)
dataloader for WSI patches. I searched the issues for keywords like
"fast", "speed", "accelerate", but did not find any best practices.
This is what I have tried for the CAMELYON16 dataset. Maybe the
maintainers/community can provide some insights?
# Import the ASAP library first!
import sys
sys.path.append('C:\\Program Files\\ASAP 2.1\\bin')
import multiresolutionimageinterface as mir
import numpy as np

reader = mir.MultiResolutionImageReader()

# Step 1 - Loop over random anchor points "pre-selected" from whole-slide images
# res = {patient_key1: {KEY_POINTS: [[x1, y1], [x2, y2], ...]}}
patch_width = ...
patch_height = ...
patient_level = ...
for patient_key in res:
    path_img = ...
    path_mask = ...
    wsi_img = reader.open(str(path_img))
    wsi_mask = reader.open(str(path_mask))
    ds_factor = wsi_mask.getLevelDownsample(patient_level)
    # Step 2 - Loop over points for a particular patient
    for point in res[patient_key][KEY_POINTS]:
        # Scale the anchor point by the downsample factor to get the patch origin
        x0 = int(point[0] * ds_factor)
        y0 = int(point[1] * ds_factor)
        wsi_patch_mask = np.array(wsi_mask.getUCharPatch(x0, y0, patch_width, patch_height, patient_level))
        wsi_patch_img = np.array(wsi_img.getUCharPatch(x0, y0, patch_width, patch_height, patient_level))
        yield (wsi_patch_img, wsi_patch_mask)
Full code can be found here
<https://gist.github.com/prerakmody/9237b618c804ca9b99c1fd21e30de496>
My concern is that I load many patches from the same patient (with some
randomization), and only once a fixed set of N patches has been loaded
from a patient do I move on to the next one. Is there no way to speed up
patch loading for a patient? Or should I load the whole image at once,
even though that may lead to memory overflow?
Thanks for the suggestion! Below is a histogram for 2000 patch accesses using