-
Notifications
You must be signed in to change notification settings - Fork 517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11784. Add missing parent directories for abort MPU and abort expired MPU requests #7566
Conversation
This is usually expected to mean "CI completed successfully with these changes". However, in this case, as you have noticed, some tests are failing: https://github.com/sokui/ozone/actions/runs/12286612359/job/34288154427#step:6:759 Unfortunately CI does not provide logs, because individual test cases after the failing one keep timing out, eventually reaching overall CI timeout. I'll make a note to improve that. You can run the same test locally. First, reduce test cases to minimum:
I tried this locally, it seems OM terminates when aborting unknown MPU. Can you please check? |
expired abort request
I fixed the test error. But the new CI run has another test failure org.apache.hadoop.fs.ozone.TestHSync . However, when I run this test locally, it has no issues. I am not sure if this test is already flaky. See the CI run on my repo here |
Thanks @sokui for fixing the test, will take a look. Don't worry about |
@sokui Please try to avoid force-push when updating the PR. Here are some great articles that explain why: https://developers.mattermost.com/blog/submitting-great-prs/#4-avoid-force-pushing |
Sure thing. I was try to cherry pick my change as one commit and put it above 1.4.1, and use it as our internal build for testing. That is why I do the force push. I will not force push for my future commits. Thanks for reminder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sokui for the patch.
protected void addMissingParentsToCache(OmBucketInfo omBucketInfo, | ||
List<OmDirectoryInfo> missingParentInfos, | ||
OMMetadataManager omMetadataManager, | ||
long volumeId, | ||
long bucketId, | ||
long transactionLogIndex) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not format method signature like this. Whenever visibility / return type / method name / other modifiers are changed, we would have to reindent all parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any code style format file that I can import to my IntelliJ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/intellij/ozone-style.xml
I'm not sure if it has anything about this particular preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
protected void addMissingParentsToCache(OmBucketInfo omBucketInfo, | ||
List<OmDirectoryInfo> missingParentInfos, | ||
OMMetadataManager omMetadataManager, | ||
long volumeId, long bucketId, long transactionLogIndex | ||
) throws IOException { | ||
// FSO is disabled. Do nothing. | ||
} | ||
|
||
protected void addMultiPartToCache( | ||
OMMetadataManager omMetadataManager, String multipartOpenKey, | ||
OMFileRequest.OMPathInfoWithFSO pathInfoFSO, OmKeyInfo omKeyInfo, | ||
String keyName, long transactionLogIndex | ||
) throws IOException { | ||
// FSO is disabled. Do nothing. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This used to do nothing for non-FSO, and was overridden to add the missing parents only for FSO. I may be missing something, but it seems the patch does not distinguish between these two cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed. Seems for all the S3 request, they pair with a FSO request. My new change only apply this logic to FSO requests. There is one request which does not pair with FSO is the S3ExpiredMultipartUploadsAbortRequest
. I applied logic to this request as well since my issue is this request.
String multipartOpenKey = omMetadataManager | ||
.getMultipartKey(volumeId, bucketId, | ||
pathInfoFSO.getLastKnownParentId(), | ||
pathInfoFSO.getLeafNodeName(), | ||
keyArgs.getMultipartUploadID()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Please indent continuations with 4 spaces, not 8. (Applies to the patch as a whole, this is only an example.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reformatted
import java.io.IOException; | ||
import java.nio.file.InvalidPathException; | ||
import java.nio.file.Paths; | ||
import java.util.ArrayList; | ||
import java.util.Collections; | ||
import java.util.List; | ||
import java.util.Map; | ||
import java.util.concurrent.atomic.AtomicReference; | ||
import java.util.function.BiFunction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not reorder imports unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is auto formatted by my IDE. Let me try to reverse this order change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted
@adoroszlai pls have another look. Thank you! |
@SaketaChalamchala Could you help to take a look as well since it's related to the logic in HDDS-10630? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sokui Thanks for working on this. Just an initial review as I'm trying to understand the related logic.
Also I noticed in #6496 that the DIRECTORY_TABLE
is not added in the @CleanupTableInfo
in the related OM response. Please help to add the DIRECTORY_TABLE
for the OM responses which add the missing directories to the directory table.
Could you also help to add some tests? I'm not entirely sure in what conditions will missing parent directories might happen.
if (bucketInfo == null) { | ||
throw new IOException("bucketInfo is null"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this? validateBucketAndVolume
will check the bucket existence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do hit this condition: String bucketKey
is not null, but 'CacheValue bucketInfo' is null. Pls let me know if my validation here is reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, validateBucketAndVolume
is already checking the cache as well. Moreover, the OM request was done under a BUCKET_LOCK, so the bucket should still exists during the request processing. Bucket not found check should be earlier before updating any OM state. Moreover, it should be an OMException
thrown with BUCKET_NOT_FOUND
ResultCodes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I think I saw this condition happened and that is why I added this block later on. But let me delete it and re-test
@@ -259,6 +289,8 @@ private void updateTableCache(OzoneManager ozoneManager, | |||
|
|||
String multipartOpenKey; | |||
try { | |||
KeyArgs keyArgs = buildKeyArgs(multipartUpload); | |||
addOrGetMissingDirectories(ozoneManager, keyArgs, trxnLogIndex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have a corresponding OM response to add the missing directories to the DB batch similar to #6496?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know your thoughts on my reply related to #6496
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the DIRECTORY_TABLE
to @CleanupTableInfo
for S3ExpiredMultipartUploadsAbortResponse
, S3MultipartUploadAbortResponseWithFSO
, and S3MultipartUploadCompleteResponseWithFSO
now
Hi @ivandika3 , Correct me if I'm wrong. Here, we only add the missing directories to the cache of the directory table (since we need to delete this key), not the real directory table. In this case, do we need to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong. Here, we only add the missing directories to the cache of the directory table (since we need to delete this key), not the real directory table. In this case, do we need to add DIRECTORY_TABLE to
@CleanupTableInfo
?
At least in #6496, since the OM response will add the missing directories to the real directory table, @CleanupTableInfo
should be added to clean the OM directory cache entries since they are already persisted in OM DB.
As for abort and expired abort MPU. I think it's OK to just add to the dir cache since the directories are just to avoid the exceptions. However, IMO we should also add the @CleanupTableInfo
for the corresponding OM response so that the cache entries will not be in the OM dir cache forever.
That said, AFAIK this is the only special case where OM request update to cache does not translate to persisting the entry in the OM DB. Hence, I'm not entire sure what might be the possible impact. For example, one case I can think of is that if the subsequent key is created using that missing directories created by this patch, but later when the cache is cleaned up or OM is restarted, the cached directories disappear and the newly created key will be orphaned.
if (bucketInfo == null) { | ||
throw new IOException("bucketInfo is null"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, validateBucketAndVolume
is already checking the cache as well. Moreover, the OM request was done under a BUCKET_LOCK, so the bucket should still exists during the request processing. Bucket not found check should be earlier before updating any OM state. Moreover, it should be an OMException
thrown with BUCKET_NOT_FOUND
ResultCodes
.
checkBucketQuotaInNamespace(omBucketInfo, missingParentInfos.size()); | ||
omBucketInfo.incrUsedNamespace(missingParentInfos.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, if we are only adding a directories in cache which will be cleaned up afterwards (for abort and expired abort), I don't think we should increased the usedNamespace of the parent bucket since this will cause usedNamespace
to never be 0 since the correspodning decrUsedNamespace
will not be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is copied from https://github.com/apache/ozone/pull/6496/files#diff-28870b5ea5ae6d1f8207b99595dc84f585becd8f2c53ee1d852d6ae34b7229f4R86-R87. Pls let me know if this logic is needed in MPU complete request. So my essential question is should we increase the used namespace at initiate time, or the complete time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the #7566 (comment) for another approach where we don't need to create missing parent directories, and therefore no need to update the bucket namespace.
if (!addMissingDirectoriesToCacheEnabled()) { | ||
return missingParentInfos; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I know what is the purpose of this? Are there any OM requests that called addOrGetMissingDirectories
but will not actually add the missing directories to the cache?
If there is, please add a corresponding log. If there isn't, I think we can just remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to differentiate the behaviors between FSO request and non-FSO request. As @adoroszlai stated here: #7566 (comment), the PR should be only related to FSO requests.
You see in this PR: by default, this addMissingDirectoriesToCacheEnabled()
will return false. But for FSO request, it overrides it as true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok understood, unlike in S3InitiateMultipartUploadRequestWithFSO
, validateAndUpdateCache
for S3MultipartUploadCompleteRequestWithFSO
and S3MultipartUploadAbortRequestWithFSO
are implemented by the parent instead, requiring the distinction with the flag.
Hi @ivandika3 , to be honest, I am not sure what the root cause. As stated in #6496, the files (S3InitiateMultipartUploadRequestWithFSO.java, S3InitiateMultipartUploadResponseWithFSO.java) contains this logic.I can understand we are missing the parent directories initially, but why we missing these directories in the complete, abort and exprire phases? Is it possible that between adding the open key to the table and adding the missing parent directories, ozone (om) got interrupted, leaving the open key exists but parent directories not there? |
@sokui Ok, I was thinking a bit more about this.
I can think one possible case, when between the initiate and complete / abort / expire, the parent directory is deleted and all the child directories and files are garbage collected by For example, there is a directory "/a" and there is a pending MPU key with path "/a/mpu_key" that was initiated but haven't been completed / aborted yet. After the MPU key was initiated, the directory "/a" is deleted, and since mpu_key has not been completed yet and does not exist in fileTable, the Now I can understand that initiate and complete MPU will need to add the missing parent directories, since the directories will be accessed again. However, for abort (expired) MPUs, I don't think #6496 applies since we don't really want to add the missing parent directories for a key that is going to be deleted anyway. Even if we add the parent directories, adding it in cache only might result in unpredictable behavior where the missing directories created can suddenly disappear since they are not persisted in the OM DB. So, I believe either we should persist the missing parent directories or we should not create them in the first place. I think the original intention is that we want to ensure the MPU abort will succeed regardless whether the parent exists or not. I can think of one way, how about we first get the In the future, for preventive measures, we can try to take into account incomplete MPUs for directory deletions. For example:
Note: I'm not too well-versed about FSO logic, so I could be wrong. |
I am testing your idea. Will keep you updated. |
@ivandika3 verified. Your idea works. I will modify my PR accordingly. |
I created a new PR with the new idea here. Pls review that PR. Will close this one if the other one is good. |
Thanks @sokui for continuing work on this. I think it's OK to completely update PR based on comments if a better approach is found. Creating new one splits discussion between the two PRs. Now that the new PR is open, let's close this one. |
I understand your consideration. I have two considerations if I continue working on the same PR: first the other approach is very simple, if I continue work on this PR, I need to revert each of the comment which looks weird. Second, I am not sure at that time point which direction should go. Thats why I create a new PR without touching this one. In case we still favor this one, I can go back without any impact. To let people know our conversation, I will link this PR in the other PR. |
What changes were proposed in this pull request?
HDDS-11784 adding missing parent directories for MPU abort and expired abort request
We observed lots of open key (files) in our FSO enabled ozone cluster. And these are all incomplete MPU keys.
When I tried to abort MPU by using s3 cli as below, I got the exception complaining about the parent directory is not found.
Exceptions in the log
This issue is similar as the issue HDDS-10630. This PR brings the same mechanism which adds the missing parent directories to the table cache before we abort the MPU. The PR adds this mechanism to both
S3ExpiredMultipartUploadsAbortRequest
andS3ExpiredMultipartUploadsAbortRequest
. Also the code is refactored so multiple places reuse the same mechanismWhat is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11784
How was this patch tested?
It tested by the CI. Also we validate it in our cluster (on going).