pub fn max_subtree_len(input_offset: u64) -> Option<u64>
Expand description
The maximum length of a subtree in bytes, given its starting offset in bytes
If you try to hash more than this many bytes as one subtree, you’ll end up merging parent nodes
that shouldn’t be merged, and your output will be garbage. Hasher::update
will currently
panic in this case, but this is not guaranteed.
For input offset zero (the default), there is no maximum length, and this function returns
None
. For all other offsets it returns Some
. Note that valid offsets must be a multiple of
CHUNK_LEN
(1024); it’s not possible to start hashing a chunk in the middle.
In the example tree below, chunks are numbered by their 0-based index. The subtree that
starts with chunk 3, i.e. input_offset = 3 * CHUNK_LEN
, includes only that one chunk, so
its max length is Some(CHUNK_LEN)
. The subtree that starts with chunk 6 includes chunk 7 but
not chunk 8, so its max length is Some(2 * CHUNK_LEN)
. The subtree that starts with chunk 12
includes chunks 13, 14, and 15, but if the tree were bigger it would not include chunk 16, so
its max length is Some(4 * CHUNK_LEN)
. One way to think about the rule here is that, if you
go beyond the max subtree length from a given starting offset, you start dealing with subtrees
that include chunks to the left of where you started.
root
/ \
. .
/ \ / \
. . . .
/ \ / \ / \ / \
. . . . . . . .
/ \ / \ / \ / \ / \ / \ / \ / \
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
The general rule turns out to be that for a subtree starting at a 0-based chunk index N greater
than zero, the maximum number of chunks in that subtree is the largest power-of-two that
divides N, which is given by 1 << N.trailing_zeros()
.
This function can be useful for writing tests or debug assertions, but it’s actually rare to
use this for real control flow. Callers who split their input recursively using
left_subtree_len
will automatically satisfy the max_subtree_len
bound and don’t
necessarily need to check. It’s also common to choose some fixed power-of-two subtree size, say
64 chunks, and divide your input up into slices of that fixed length (with the final slice
possibly short). This approach also automatically satisfies the max_subtree_len
bound and
doesn’t need to check. Proving that this is true can be an interesting exercise. Note that
chunks 0, 4, 8, and 12 all begin subtrees of at least 4 chunks in the example tree above.
§Panics
This function currently panics if input_offset
is not a multiple of CHUNK_LEN
. This is not
guaranteed.