
Deferred Writes #155

Merged: 3 commits from deferred_writes into master, Sep 8, 2017
Conversation

TylerADavis
Collaborator

Fixes #151. Passes the new test written for it, as well as cpplint.

The design is as follows:

  • A new argument was added to the cache manager's constructor. By default, writes are not deferred, but deferral can be enabled.
  • If writes are deferred, an item is not put to the remote store until it is evicted. The flush is performed as an asynchronous write, and a map keeps track of the resulting future.
  • When a read is performed, the cache manager checks the map for an ongoing asynchronous operation on that id. If one exists, it waits for the operation to complete before the read proceeds.
  • Completed operations are cleared from the map either when an eviction would grow the map past 50 entries, or when a get() is made on an id that was evicted.
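The design above could be sketched roughly as follows. This is a minimal single-node illustration, not the PR's actual code: `CacheManager`, `RemoteStore`, `ObjectID`, and the `put`/`get`/`evict_one` names are all hypothetical stand-ins, and a `std::map` plays the role of the remote store.

```cpp
#include <chrono>
#include <cstdint>
#include <future>
#include <map>
#include <mutex>
#include <string>
#include <unordered_map>

using ObjectID = uint64_t;

// Hypothetical stand-in for the real remote backend.
class RemoteStore {
 public:
  void put(ObjectID id, const std::string& v) {
    std::lock_guard<std::mutex> g(m_);
    data_[id] = v;
  }
  std::string get(ObjectID id) {
    std::lock_guard<std::mutex> g(m_);
    return data_.at(id);
  }
 private:
  std::mutex m_;
  std::map<ObjectID, std::string> data_;
};

class CacheManager {
 public:
  CacheManager(RemoteStore* store, size_t capacity, bool defer_writes = false)
      : store_(store), capacity_(capacity), defer_writes_(defer_writes) {}

  void put(ObjectID id, const std::string& v) {
    if (cache_.size() >= capacity_ && cache_.count(id) == 0) evict_one();
    cache_[id] = v;
    if (!defer_writes_) store_->put(id, v);  // write-through when not deferred
  }

  std::string get(ObjectID id) {
    // If an asynchronous flush of this object is in flight, wait for it
    // before reading, so the remote store is up to date.
    auto it = pending_.find(id);
    if (it != pending_.end()) {
      it->second.wait();
      pending_.erase(it);  // completed operation cleared on get()
    }
    auto c = cache_.find(id);
    if (c != cache_.end()) return c->second;
    return store_->get(id);
  }

 private:
  void evict_one() {
    auto victim = cache_.begin();  // placeholder policy, not LRU
    if (defer_writes_) {
      // Deferred write: flush asynchronously on eviction, keep the future.
      pending_[victim->first] = std::async(
          std::launch::async,
          [s = store_, id = victim->first, v = victim->second] {
            s->put(id, v);
          });
      if (pending_.size() > 50) clear_completed();
    }
    cache_.erase(victim);
  }

  void clear_completed() {
    for (auto it = pending_.begin(); it != pending_.end();) {
      if (it->second.wait_for(std::chrono::seconds(0)) ==
          std::future_status::ready)
        it = pending_.erase(it);
      else
        ++it;
    }
  }

  RemoteStore* store_;
  size_t capacity_;
  bool defer_writes_;
  std::unordered_map<ObjectID, std::string> cache_;
  std::unordered_map<ObjectID, std::future<void>> pending_;
};
```

With deferral enabled, a `get()` on an evicted id blocks on the pending future and then reads the flushed value from the remote store.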

@TylerADavis TylerADavis requested a review from jcarreira August 21, 2017 23:26
@jcarreira
Owner

In this code objects are flushed to the remote store when an object is evicted.

I think there are two issues with this approach:

  1. It can take an arbitrarily long time for a change to become visible in the remote object store.

     It would be nice to have a bound on this time (e.g., something like 1 or 10 ms) so that distributed applications have some guarantee.

  2. There can be an arbitrarily large number of 'dirty' writes pending and lying around.

A solution could be to have a thread that resolves the pending writes every X ms. What do you think?
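The suggested periodic-flush thread could be sketched as below. This is an illustration under assumptions, not code from the PR: `DirtyFlusher` is a hypothetical name, and a local `std::map` stands in for the remote store. The point is that a change becomes remotely visible at most one flush interval after it is made.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Sketch of a background thread that bounds write-visibility latency by
// flushing all dirty entries every `interval`.
class DirtyFlusher {
 public:
  explicit DirtyFlusher(std::chrono::milliseconds interval)
      : interval_(interval), stop_(false), worker_([this] { run(); }) {}

  ~DirtyFlusher() {
    {
      std::lock_guard<std::mutex> g(m_);
      stop_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Called by the cache whenever an entry is updated locally.
  void mark_dirty(uint64_t id, std::string value) {
    std::lock_guard<std::mutex> g(m_);
    dirty_[id] = std::move(value);
  }

  size_t flushed_count() {
    std::lock_guard<std::mutex> g(m_);
    return flushed_.size();
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lk(m_);
    while (!stop_) {
      cv_.wait_for(lk, interval_);  // wake at least every interval_
      // "Remote put" of every dirty entry, then clear the dirty set.
      for (auto& kv : dirty_) flushed_[kv.first] = kv.second;
      dirty_.clear();
    }
  }

  std::chrono::milliseconds interval_;
  bool stop_;
  std::mutex m_;
  std::condition_variable cv_;
  std::map<uint64_t, std::string> dirty_;
  std::map<uint64_t, std::string> flushed_;  // stands in for the remote store
  std::thread worker_;                       // declared last: starts last
};
```

Holding a single lock during the flush keeps the sketch simple, but it also stalls writers for the duration of the flush, which is exactly the difficulty discussed below.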

@TylerADavis
Collaborator Author

That definitely makes sense.
I agree: a thread that sleeps for 10 ms, checks for dirty entries, and pushes them to the remote store would be a good approach. The main difficulty I can see is keeping the cache from being modified while dirty entries are being written to the remote store.

@TylerADavis
Collaborator Author

One approach may be as follows:

Whenever an item is updated in the local cache, it is marked as dirty and its object ID is placed in a lock-free queue of ObjectIDs. The processing thread pops the first ObjectID and looks it up in the map (we can convert to a cuckoo hash map). If the object is not found (because it was evicted), the id is ignored. If it is found, the cache entry is marked as clean and the item is pushed to the store asynchronously. Asynchronously pushed items have their futures stored in an unordered map, much like the existing setup, except that the map will be protected by a lock.

When items are evicted, we check whether they are dirty. If they are, the evicting thread pushes the item asynchronously to the store.

As I mentioned in #100, there is a possibility that converting the map to a cuckoo map may introduce extra copying, but I am not certain of that.
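The dirty-queue design above could be sketched as follows. This is a deliberately single-threaded illustration with hypothetical names (`DirtyQueueCache`, `process_one`): a `std::deque` stands in for the lock-free queue, a `std::map` for the remote store, and `process_one()` represents one iteration of the processing thread's loop, called inline for clarity.

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <string>

using ObjectID = uint64_t;

struct CacheEntry {
  std::string value;
  bool dirty = false;
};

// Single-threaded sketch of the queue-based design: updates enqueue the id,
// the processing step flushes dirty entries, and stale ids (entries evicted
// before being processed) are simply ignored.
class DirtyQueueCache {
 public:
  void update(ObjectID id, const std::string& v) {
    auto& e = cache_[id];
    e.value = v;
    e.dirty = true;
    queue_.push_back(id);  // hand the id to the processing thread
  }

  void evict(ObjectID id) {
    auto it = cache_.find(id);
    if (it == cache_.end()) return;
    if (it->second.dirty) remote_[id] = it->second.value;  // flush on evict
    cache_.erase(it);
    // The id may still sit in queue_; process_one() will skip it.
  }

  // One step of the processing thread's loop.
  void process_one() {
    if (queue_.empty()) return;
    ObjectID id = queue_.front();
    queue_.pop_front();
    auto it = cache_.find(id);
    if (it == cache_.end()) return;  // evicted meanwhile: stale id, ignore
    if (!it->second.dirty) return;   // already flushed
    it->second.dirty = false;        // mark clean, then push to the store
    remote_[id] = it->second.value;  // stands in for the async remote put
  }

  const std::map<ObjectID, std::string>& remote() const { return remote_; }

 private:
  std::map<ObjectID, CacheEntry> cache_;
  std::deque<ObjectID> queue_;
  std::map<ObjectID, std::string> remote_;  // stands in for the remote store
};
```

In the real design the queue would be lock-free, the remote push asynchronous with its future tracked in a locked unordered map, and `process_one()` would run in a dedicated thread.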

@TylerADavis
Collaborator Author

I'll go ahead and implement this by first placing a lock around the existing cache; we can later convert it to a cuckoo hash map.

@jcarreira jcarreira merged commit 6705edf into master Sep 8, 2017
@jcarreira jcarreira deleted the deferred_writes branch November 16, 2017 06:14
Successfully merging this pull request may close these issues.

Add deferred cache writes option