To avoid large delays in metacache cleanup, use rename
instead of recursive delete calls, renames are cheaper
move the content to minioMetaTmpBucket and then cleanup
this folder once in 24hrs instead.
If the new cache can replace an existing one, we should
let it replace since that is currently being saved anyways,
this avoids pile up of 1000's of metacache entires for
same listing calls that are not necessary to be stored
on disk.
This PR refactors the way we use buffers for O_DIRECT and
to re-use those buffers for messagepack reader writer.
After some extensive benchmarking found that not all objects
have this benefit, and only objects smaller than 64KiB see
this benefit overall.
Benefits are seen from almost all objects from
1KiB - 32KiB
Beyond this no objects see benefit with bulk call approach
as the latency of bytes sent over the wire v/s streaming
content directly from disk negate each other with no
remarkable benefits.
All other optimizations include reuse of msgp.Reader,
msgp.Writer using sync.Pool's for all internode calls.
This PR adds transition support for ILM
to transition data to another MinIO target
represented by a storage class ARN. Subsequent
GET or HEAD for that object will be streamed from
the transition tier. If PostRestoreObject API is
invoked, the transitioned object can be restored for
duration specified to the source cluster.
Delete marker replication is implemented for V2
configuration specified in AWS spec (though AWS
allows it only in the V1 configuration).
This PR also brings in a MinIO only extension of
replicating permanent deletes, i.e. deletes specifying
version id are replicated to target cluster.
Similar to #10775 for fewer memory allocations, since we use
getOnlineDisks() extensively for listing we should optimize it
further.
Additionally, remove all unused walkers from the storage layer
Design: https://gist.github.com/klauspost/025c09b48ed4a1293c917cecfabdf21c
Gist of improvements:
* Cross-server caching and listing will use the same data across servers and requests.
* Lists can be arbitrarily resumed at a constant speed.
* Metadata for all files scanned is stored for streaming retrieval.
* The existing bloom filters controlled by the crawler is used for validating caches.
* Concurrent requests for the same data (or parts of it) will not spawn additional walkers.
* Listing a subdirectory of an existing recursive cache will use the cache.
* All listing operations are fully streamable so the number of objects in a bucket no
longer dictates the amount of memory.
* Listings can be handled by any server within the cluster.
* Caches are cleaned up when out of date or superseded by a more recent one.
This PR adds support for healing older
content i.e from 2yrs, 1yr. Also handles
other situations where our config was
not encrypted yet.
This PR also ensures that our Listing
is consistent and quorum friendly,
such that we don't list partial objects
The S3 specification says that versions are ordered in the response of
list object versions.
mc snapshot needs this to know which version comes first especially when
two versions have the same exact last-modified field.
- Implement a new xl.json 2.0.0 format to support,
this moves the entire marshaling logic to POSIX
layer, top layer always consumes a common FileInfo
construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
for object placements.
Fixes#2111
This is necessary for calculating the total storage
capacity from object layer. This value is also needed for
browser UI.
Buckets used to carry this information, this patch
deprecates this feature.
It is the bucket and volumes which needs to have this
value rather than the DiskInfo API itself. Eventually
this can be extended to show disk usage per
Buckets/Volumes whenever we have that functionality.
For now since buckets/volumes are thinly provisioned
this is the right approach.
- Marker should be escaped outside in handlers.
- Delimiter should be handled outside in handlers.
- Add missing comments and change the function names.
- Handle case of 'maxKeys' when its set to '0', its a valid
case and should be treated as such.
* listObjects: improve response time by not doing stat during readDir() operation.
* listObjects: Add windows support.
* listObjects: Readdir() in batches to conserve memory. Add solaris build.
* listObjects: cleanup code.
- over the course of a project history every maintainer needs to update
its dependency packages, the problem essentially with godep is manipulating
GOPATH - this manipulation leads to static objects created at different locations
which end up conflicting with the overall functionality of golang.
This also leads to broken builds. There is no easier way out of this other than
asking developers to do 'godep restore' all the time. Which perhaps as a practice
doesn't sound like a clean solution. On the other hand 'godep restore' has its own
set of problems.
- govendor is a right tool but a stop gap tool until we wait for golangs official
1.5 version which fixes this vendoring issue once and for all.
- govendor provides consistency in terms of how import paths should be handled unlike
manipulation GOPATH.
This has advantages
- no more compiled objects being referenced in GOPATH and build time GOPATH
manging which leads to conflicts.
- proper import paths referencing the exact package a project is dependent on.
govendor is simple and provides the minimal necessary tooling to achieve this.
For now this is the right solution.
- All test files have been renamed to their respective <package>_test name,
this is done in accordance with
- https://github.com/golang/go/wiki/CodeReviewComments#import-dot
imports are largely used in testing, but to avoid namespace collision
and circular dependencies
- Never use _* in package names other than "_test" change fragment_v1 to expose
fragment just like 'gopkg.in/check.v1'