-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added optional serde support, including an example in the README #59
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this ought to work, but there is the question on the serialization format. How were you planning to use serialization? Just to get idea on common usage.
@@ -70,6 +75,7 @@ use util::*; | |||
/// Note, a `BitSet` is limited by design to only `usize**4` indices. | |||
/// Adding beyond this limit will cause the `BitSet` to panic. | |||
#[derive(Clone, Debug, Default)] | |||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, deriving should work.
However, it is possible to manually implemented the serialization and deserialization so that they use only layer0
. This would make serialized result use less memory and make serialization faster, but possibly slow deserialization down. When deserializing we have to do operations for all layers in both implementations and IO is usually the more expensive part, so the custom implementation would most likely be faster in general.
Choosing which to use should be chosen before publishing new version as changing serialization format is a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memory usage difference is that with the derive we use O(n log_{bits in usize} n)
and with custom derive we use O(n)
memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to benchmark the difference in speed if we want to have the custom implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is totally up to your team, no doubt a custom serializer would be more compact and faster. In my use case I have a bitset of 100K bits, the size is not as compact as I'd like, about the same as a test with fixedbitset (12K). If there is a way to get it much smaller I would be interested, and if you can guide me on that I may have time to implement it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So how the data structure works is that the layer0
is essentially fixedbitset
.
So if we want to access n:th bit we find which usize
in the Vec
it belongs to and then mask the right bit out of it.
How hibitset
differs is that it has layers above it. The idea is that if there is a 1
in on upper layer at some point, then layer below contains 1
's in corresponding usize
. This means that if n
:th bit is 1
in layer0
then in layer1
there will be 1
in floor(n/64)
position and 1
in floor(n/(62^2))
position of layer2
and so on.
What all this means is that in serialization we can just serialize layer0
as is (which means that the serialized form is pretty much the same as in fixedbitset
). However, when deserializing, the layers above have to be reconstructed. Most naive way to do this would be just to create empty hibitset
with correct size with with_capacity
and then insert bits one by one. Less naive way would be to create hibitset
with deserialized vector on layer0
and, empty, but right sized, layers above and then go through layer1
checking if corresponding layer0
's usize
is equal to 0
or not, repeating until on highest layer.
If you want to try your hand on implementing it, I can help you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WaDelma Here is a simple implementation of the naive way to do this, now if you can give me some pointers I'll try the layer building.
// Small test to serialize layer0 structure and deserialize by setting each bit
let set3 = &mut BitSet::new();
set3.add(3);
set3.add(9);
set3.add(72);
let set3_layer0 = &mut set3.layer0_as_slice();
// Create a new set to deserialize into
let result_set3 = &mut BitSet::with_capacity(set3_layer0.len() as u32);
let mask = 1u64;
println!("mask: {:#066b}", mask);
for i in 0..set3_layer0.len() {
let u = set3_layer0[i];
println!("hibitset: set3: u: {:#066b}", u);
for j in 0..std::mem::size_of::<usize>() * 8 {
let bit = if ((u as u64) >> j) & (mask as u64) > 0 { true } else { false };
println!("bit pos: {} bit: {}", i, bit);
if bit {
result_set3.add((j as u32) + (i as u32 * 64));
}
}
}
for elem in result_set3.iter() {
println!("hibitset: elem: {}", elem);
}
assert_eq!(result_set3, set3);
assert!(result_set3.contains(9u32));
I'm sure your method will be much faster so I'd like to try it with some guidance, but this gives me a test to start with that functions properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WaDelma I ran some tests, and using serde
and bitpacking
the size is only slightly larger than the custom serialization. It was about 12.7K using serde
and the custom serialization was just over 12K. It could be because my bitset was not very dense, so maybe it will make a bigger difference if we use your more advanced method. Let me know what you think, I'll wait until you can provide some direction on how to do the layer reconstruction upon deserialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cisaacson, Did you try and see how much fixedbitset
uses memory? Just to see if that reduction looks good or not.
It could be possible to do some tricks to make sparse/dense bitmaps to take less space, but those might be format dependent and will make some bitsets use more memory, so not sure about them.
As for updating upper layers:
Let b := bits in usize
If your the length of layer0
is n
then the length of layer1
has to be m = ceil(n/b)
This means that for layer1
you can loop from 0
to non-inclusive m
.
For each iteration j
we need to go through all bits of the corresponding usize
.
For each bit i
we check if layer0[(j * b) + i ] != 0
to see if the current bit should be 1
or not.
This same works with pretty much all layers.
(Was away, so got to answer this only now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WaDelma Thanks that makes sense.
I did some more work and research on this, I don't expect there will be much savings for reasonable size BitSets that are sparse over the serde
method. Just layer0
for 100K bits is about 12.5K, which is what you would expect, I don't have exact stats any more but serde
was about 12.7K. For my use case serde
will work for sure.
I also did some homework on sparse bitsets and how to compress them. This is a challenging area (Roaring has done a tremendous amount of work on this). My use case is definitely random and sparse, which makes it virtually impossible to get good compression. Your library is better than Roaring in terms of space by about 25%, because once you have a random sparse BitSet things like run-length encoding only slow it down and take more space. I'm very happy with Hibitset, we will definitely be using it. It's ideal for our use case, but serializing to a smaller size does not seem too feasible right now.
Building the layers as you state would be faster using the layer0 serialization, but again I don't know if it is worth it over using the serde
implementation.
Let me know what you think, I am good for now with what you have provided and the enhancements I proposed for serde
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, its maybe better to start with serde generated implementation and then maybe optimize it later if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
bors r+ Sorry for leaving this hanging. |
bors r+ lets try again :D |
I seem to still be on list of people bors responds to. Is it dead or what has happened? |
Thanks, I hope it is useful to others. Let me know if you need me to do anything else. |
Here is the PR for the
serde
feature. I added a simple example usingbincode
in the README. There is no test, I could do that easily but it would mean adding another dev dependency, not sure if you wanted that in the codebase.The version has not been changed, that should be done if you decide this to put it in a release.