The problem in existing code was the following line
```
start := int(keyCrc%uint32(cardinality)) | 1
```
A given a value of N cardinality the ending result
because of the the bitwise '|' would lead to always
higher affinity to odd sequences.
As can be seen from the test cases that this can
lead to many objects being allocated the same set
of disks or atleast the first disk is an odd disk
always. This introduces a performance problem
for majority of the objects under concurrent load.
Remove `| 1` to provide a more cleaner distribution
and the new code will be.
```
start := int(keyCrc % uint32(cardinality))
```
Thanks to Krishna Srinivas for pointing out the bitwise
situation here.
Amazon S3 API expects all incoming stream has a content-length
set it was superflous for us to support object layer which supports
unknown sized stream as well, this PR removes such requirements
and explicitly error out if input stream is less than zero.
We don't need to typecast identifiers from
their base to type to same type again. This
is not a bug and compiler is fine to skip
it but it is better to avoid if not needed.
This change provides new implementations of the XL backend operations:
- create file
- read file
- heal file
Further this change adds table based tests for all three operations.
This affects also the bitrot algorithm integration. Algorithms are now
integrated in an idiomatic way (like crypto.Hash).
Fixes#4696Fixes#4649Fixes#4359
xl.storageDisks is sometimes passed to some low-level XL functions. Some disks in
xl.storageDisks are set to nil when they encounter some errors. This means all
elements in xl.storageDisks will be nil after some time which lead to an unusable XL.
This is an enhancement to the XL/distributed-XL mode. FS mode is
unaffected.
The ReadFileWithVerify storage-layer call is similar to ReadFile with
the additional functionality of performing bit-rot checking. It
accepts additional parameters for a hashing algorithm to use and the
expected hex-encoded hash string.
This patch provides significant performance improvement because:
1. combines the step of reading the file (during
erasure-decoding/reconstruction) with bit-rot verification;
2. limits the number of file-reads; and
3. avoids transferring the file over the network for bit-rot
verification.
ReadFile API is implemented as ReadFileWithVerify with empty hashing
arguments.
Credits to AB and Harsha for the algorithmic improvement.
Fixes#4236.
This PR also does backend format change to 1.0.1
from 1.0.0. Backward compatible changes are still
kept to read the 'md5Sum' key. But all new objects
will be stored with the same details under 'etag'.
Fixes#4312
This PR is for readability cleanup
- getOrderedDisks as shuffleDisks
- getOrderedPartsMetadata as shufflePartsMetadata
Distribution is now a second argument instead being the
primary input argument for brevity.
Also change the usage of type casted int64(0), instead
rely on direct type reference as `var variable int64` everywhere.
Current implementation didn't honor quorum properly and didn't
handle the errors generated properly. This patch addresses that
and also moves common code `cleanupMultipartUploads` into xl
specific private function.
Fixes#3665
This is needed as explained by @krisis
Lets say we have following errors.
```
[]error{nil, errFileNotFound, errDiskAccessDenied, errDiskAccesDenied}
```
Since the last two errors are filtered, the maximum is nil,
depending on map order.
Let's say we get nil from reduceErr. Clearly at this point
we don't have quorum nodes agreeing about the data and since
GetObject only requires N/2 (Read quorum) and isDiskQuorum
would have returned true. This is problematic and can lead to
undersiable consequences.
Fixes#3298
Disks when are offline for a long period of time, we should
ignore the disk after trying Login upto 5 times.
This is to reduce the network chattiness, this also reduces
the overall time spent on `net.Dial`.
Fixes#3286
- Using gjson for constructing xlMetaV1{} in realXLMeta.
- Test for parsing constructing xlMetaV1{} using gjson.
- Changes made since benchmarks showed 30-40% improvement in speed.
- Follow up comments in issue https://github.com/minio/minio/issues/2208
for more details.
- gjson parsing of parts from xl.json for listParts.
- gjson parsing of statInfo from xl.json for getObjectInfo.
- Vendorizing gjson dependency.
* XL: Refactor xl.GetObject and erasureReadFile. erasureReadFile() responsible for just erasure coding, it takes ordered disks and checkSum slice.
* move getOrderedPartsMetadata and getOrderedDisks to xl-v1-utils.go
* Review fixes.
This change is needed to make reading from objects future proof
in-terms of handling online disks. Our current counter is not
based on affirmative knowledge and relies on arithmetic sequence
which can lead to bugs.
Using modTime simplifies the understanding of `xl.json` and future
tooling / debugging of the format.
* xl/selfheal: selfheal based on read quorum on GET
* xl: getReadableDisks() also returns whether self-heal is needed so that this info can be used by ReadFile/SelfHeal/StatFile.
* xl: trigger selfheal from StatFile.