This repository has been archived by the owner on Dec 11, 2019. It is now read-only.

Fight marks corruption #10

Open
fingolfin opened this issue Dec 4, 2015 · 1 comment

@fingolfin
Owner

There is a git-marks-check in this repository, written by felipec, which helps with some cases where the marks file got corrupted.

This helps, but the real problem is that there is corruption in the first place. We should try hard to avoid that. So here are some thoughts I had on possible causes of corruption; please take them with a grain of salt, though, as they might be bogus (e.g. due to my lack of understanding of what is really going on).

  1. One possible source of corruption: if git-remote-hg is killed while writing a marks file, this might cause corruption. So perhaps we need to implement an atomic write, i.e. use json.dump to write to a new file and, only once that has finished, replace the old marks file with the new one via an atomic rename (at least where available). See also http://stackoverflow.com/questions/7645338/atomic-file-replacement-in-python https://bugs.python.org/issue8828
  2. Another situation I am wary of: what happens if git purges commits that were created from hg commits? In our marks file, we recorded the correspondence. The next time we import something, couldn't it happen that we want to reference one of those git commits that are no longer there? In fact, what does git fast-import even do if we give it marks for commits that do not exist anymore? At a quick glance at the code, it seems as if it might die with an object-not-found error. That would be quite bad. Let's try to trigger such a situation and (dis)prove that there is a problem. One idea for how to go about this:
    1. Create a hg repository with a default branch, plus a separate new branch N containing a single commit C. Make a backup of the repository.
    2. Clone it using git-remote-hg. Do not check out branch N.
    3. Purge that branch N on the hg side.
    4. Now git fetch --prune; clear the reflog, gc, etc. to make sure that the git commit corresponding to C goes away (it should, since no ref would reference it anymore). Of course, it would still be in our marks file.
    5. Optionally, reintroduce commit C on the hg side from the backup (this is not necessary)
    6. Fetch the changes.
      I added the optional step 5 to demonstrate that we cannot simply fix this issue by cleaning up the marks file: we need to remember which hg commit corresponds to which git commit, even if one of the two is temporarily gone (of course, we could try to extract that information from git notes, too).
  3. Other possible causes?
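
To make the atomic-write idea from point 1 concrete, here is a minimal sketch. It is not the actual git-remote-hg code: the function name `write_marks_atomically` and the dictionary passed to it are hypothetical, and it assumes Python 3.3 or later, where `os.replace` is available. On Python 2, `os.rename` gives the same atomic replacement on POSIX but fails on Windows if the destination already exists.

```python
import json
import os
import tempfile


def write_marks_atomically(path, marks):
    """Write `marks` as JSON to `path` without ever exposing a partial file.

    Hypothetical helper: the data is written to a temporary file in the
    same directory, flushed to disk, and only then renamed over `path`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".marks-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(marks, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # Atomically replace the old marks file with the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure (including interruption), drop the temporary file
        # and leave the old marks file untouched.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

If the process is killed before `os.replace` runs, the old marks file is still intact and at worst a stray temporary file is left behind. This only addresses partial writes; it does nothing for the dangling-marks scenario in point 2.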
@fingolfin
Owner Author

See also felipec#26 (I should migrate that issue here, if I can reliably reproduce it)
