-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop BoltDB from write-cache #3091
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3091 +/- ##
==========================================
- Coverage 22.40% 22.23% -0.18%
==========================================
Files 763 762 -1
Lines 58632 58276 -356
==========================================
- Hits 13136 12956 -180
+ Misses 44603 44436 -167
+ Partials 893 884 -9 ☔ View full report in Codecov by Sentry. |
22d9414
to
9957225
Compare
9957225
to
b9784dd
Compare
An error is currently being returned after migration. Can you check which errors may appear, specifically I am concerned when there are no default bucket or other errors inside the database |
// defaultFlushInterval is default time interval between successive flushes. | ||
defaultFlushInterval = time.Second | ||
defaultFlushInterval = 10 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change should be reflected in documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did't find where in the documentation it is specified about the flush interval for the write-cache. It's not set up anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly that the workers were only for parallel processing from the BoltDB?
TBH, I do not understand what is the key difference b/w db and FSTree if we are talking about flushing. I think it is possible to speed up flushing if there are 2+ workers. We do spend some time on object reading from WC cache and its unmarshaling. @roman-khimov
Also, we store raw objects on FS now, I think closing this issue should be done with more precise WC size handling:
neofs-node/pkg/local_object_storage/writecache/state.go
Lines 11 to 13 in 9bc61c5
func (c *cache) estimateCacheSize() uint64 { | |
return c.objCounters.DB()*c.smallObjectSize + c.objCounters.FS()*c.maxObjectSize | |
} |
@roman-khimov
storagelog.Write(c.log, | ||
storagelog.AddressField(saddr), | ||
storagelog.StorageTypeField(wcStorageType), | ||
storagelog.OpField("db DELETE"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also can drop "fstree" in every WC operation log since it is now clear that it was dropped from FSTREE
// Write-cache has 2 components: | ||
// 1. Key-value (bbolt) database for storing small objects. | ||
// 2. Filesystem tree for storing big objects. | ||
// Write-cache uses filesystem tree for storing objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"file system" (separate)?
c.modeMtx.Unlock() | ||
|
||
// Migration part | ||
if !readOnly { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we prevent !
and just invert branches?
return nil | ||
} | ||
|
||
func (c *cache) migrate() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have some simple migrate test?
CHANGELOG.md
Outdated
|
||
### Updated | ||
|
||
### Updating from v0.44.2 | ||
`small_object_size`, `workers_number`, `max_batch_size` and `max_batch_delay` | ||
paramteters are removed from `writecache` config. These parameters are related | ||
to the BoltDB part of the write-cache, which is dropped from the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add, please, a few words that migration will be performed automatically
db, err := writecache.OpenDB(vPath, true) | ||
// openWC opens and returns read-only writecache.Cache located in vPath. | ||
func openWC() (writecache.Cache, error) { | ||
ws := writecache.New(writecache.WithPath(vPath)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ws := writecache.New(writecache.WithPath(vPath)) | |
wc := writecache.New(writecache.WithPath(vPath)) |
} | ||
} | ||
} | ||
|
||
// WithNoSync sets an option to allow returning to caller on PUT before write is persisted. | ||
// Note, that we use this flag for FSTree only and DO NOT use it for a bolt DB because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line should be dropped
Estimations and workers are separate things to be done in separate issues, please open them if they're not here yet. Eventually I'd like wc to keep it's object list in memory (with sizes) and yes, it needs better flushing mechanism. But this doesn't change anything anything here, Bolt is to be removed and it is removed. |
Although there is `ignoreErrors` flag, but when `flushObjects` occurs in FSTree, the error should be returned as with Bolt. Signed-off-by: Andrey Butusov <[email protected]>
Previously, the write-cache has 2 components: bolt database and fstree. BoltDB is absolutely useless for write caching given that the cache uses SSD, so it was decided to drop this database. Drop all code related to BoltDB. Migrate from this database to the main storage, by flushing objects. Remove config parameters for the write-cache: `small_object_size`, `workers_number`, `max_batch_delay`, `max_batch_size`. Update docs. Rewrite lens commands for the current version of the write-cache. Closes #3076. Signed-off-by: Andrey Butusov <[email protected]>
b9784dd
to
3364ab5
Compare
Do not agree. It used to be parallel, it is not now, it is a degradation. It used to have incorrect estimations only because of the existing bbolt base, now we do not have it but still, our expectation about WC size is numOfObjs*theBiggestPossibleObj. I see no better place for these fixes except this PR. |
Bolt-only. FSTree is all the same. |
Closes #3076.
Do I understand correctly that the workers were only for parallel processing from the BoltDB?