minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	b912c8f035	fix: generate new version when replacing metadata in CopyObject (#9871 )	5 years ago
Harshavardhana	e79874f58e	[feat] Preserve version supplied by client (#9854 ) Just like GET/DELETE APIs it is possible to preserve client supplied versionId's, of course the versionIds have to be uuid, if an existing versionId is found it is overwritten if no object locking policies are found. - PUT /bucketname/objectname?versionId=<id> - POST /bucketname/objectname?uploads=&versionId=<id> - PUT /bucketname/objectname?versionId=<id> (with x-amz-copy-source)	5 years ago
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	5 years ago
Harshavardhana	ff94b1b0a9	isEndpointConnected should take local disk inputs (#9803 ) PR #9801 while it is correct, the loop isEndpointConnected() was changed to rely on endpoint.String() which has the host information as well, which is not correct value as input to detect if the disk is down or up, if endpoint is local use its local path value instead.	5 years ago
Harshavardhana	62b1da3e2c	fix offline disk calculation (#9801 ) Current code was relying on globalEndpoints as the source of secondary truth to obtain the missing endpoints list when the disk is offline, this is problematic - there is no way to know if the getDisks() returned endpoints total is same as the ones list of globalEndpoints and it belongs to a particular set. - there is no order guarantee as getDisks() is ordered as per format.json, globalEndpoints may not be, so potentially end up including incorrect endpoints. To fix this bring getEndpoints() just like getDisks() to ensure that consistently ordered endpoints are always available for us to ensure that returned values are consistent with what each erasure set would observe.	5 years ago
Harshavardhana	342ade03f6	deprecate listDir usage for healing (#9792 ) listDir was incorrectly used for healing which is slower, instead use Walk() to heal the entire set.	5 years ago
Harshavardhana	41688a936b	fix: CopyObject behavior on expanded zones (#9729 ) CopyObject was not correctly figuring out the correct destination object location and would end up creating duplicate objects on two different zones, reproduced by doing encryption based key rotation.	5 years ago
Harshavardhana	b2db8123ec	Preserve errors returned by diskInfo to detect disk errors (#9727 ) This PR basically reverts #9720 and re-implements it differently	5 years ago
Harshavardhana	b330c2c57e	Introduce simpler GetMultipartInfo call for performance (#9722 ) Advantages avoids 100's of stats which are needed for each upload operation in FS/NAS gateway mode when uploading a large multipart object, dramatically increases performance for multipart uploads by avoiding recursive calls. For other gateway's simplifies the approach since azure, gcs, hdfs gateway's don't capture any specific metadata during upload which needs handler validation for encryption/compression. Erasure coding was already optimized, additionally just avoids small allocations of large data structure. Fixes #7206	5 years ago
Klaus Post	95814359bd	cache disk info to avoid repeated calls (#9682 ) This value is requested on every upload when there are multiple zones. Since this will result in an RPC call to every remote disk this scales quite badly in a distributed setup. Load every 1second interval. 2 servers, localhost only. In large distributed setups much bigger gains can be expected. ``` Operations: 21743 -> 22454 * Average: +3.28% (+0.0 MiB/s) throughput, +3.28% (+11.9) obj/s * Fastest: +3.37% (+0.0 MiB/s) throughput, +3.37% (+13.0) obj/s * 50% Median: +3.03% (+0.0 MiB/s) throughput, +3.03% (+11.2) obj/s * Slowest: +8.03% (+0.0 MiB/s) throughput, +8.03% (+22.8) obj/s ``` For easy management of this a generic helper has been added.	5 years ago
Krishna Srinivas	7d19ab9f62	readiness returns error quickly if any of the set is down (#9662 ) This PR adds a new configuration parameter which allows readiness check to respond within 10secs, this can be reduced to a lower value if necessary using ``` mc admin config set api ready_deadline=5s ``` or ``` export MINIO_API_READY_DEADLINE=5s ```	5 years ago
P R	3f6d624c7b	add gateway object tagging support (#9124 )	5 years ago
Anis Elleuch	9baeda781a	fix storage info output with unordered endpoints arguments (#9610 ) Shuffling arguments that we pass to MinIO server are supported. However, when that happens, Prometheus returns wrong information about disks usage and online/offline status. The commit fixes the issue by avoiding relying on xl.endpoints since it is not ordered.	5 years ago
Harshavardhana	bd032d13ff	migrate all bucket metadata into a single file (#9586 ) this is a major overhaul by migrating off all bucket metadata related configs into a single object '.metadata.bin' this allows us for faster bootups across 1000's of buckets and as well as keeps the code simple enough for future work and additions. Additionally also fixes #9396, #9394	5 years ago
Harshavardhana	6ac48a65cb	fix: use unused cacheMetrics code in prometheus (#9588 ) remove all other unused/dead code	5 years ago
Anis Elleuch	c045ae15e7	fix: avoid undoing bucket creation and return the first err instead (#9578 )	5 years ago
Harshavardhana	a1de9cec58	cleanup object-lock/bucket tagging for gateways (#9548 ) This PR is to ensure that we call the relevant object layer APIs for necessary S3 API level functionalities allowing gateway implementations to return proper errors as NotImplemented{} This allows for all our tests in mint to behave appropriately and can be handled appropriately as well.	5 years ago
Harshavardhana	4c9de098b0	heal buckets during init and make sure to wait on quorum (#9526 ) heal buckets properly during expansion, and make sure to wait for the quorum properly such that healing can be retried.	5 years ago
Bala FA	3773874cd3	add bucket tagging support (#9389 ) This patch also simplifies object tagging support	5 years ago
Harshavardhana	c2529260e7	fix: crash observed when position of drives different (#9490 ) allocate the disk slice properly before populating disk by its ID and its position. Fixes #9416	5 years ago
Anis Elleuch	7ad6bc955f	show a notice when mixed rootfs & mounted disks is detected (#9471 ) A user can incorrectly mount a fresh disk. MinIO will detect that it is writing with a rootfs disk and will mark it down. However, it is hard for the user to understand what's going on. This commit will just print a notice so it will be easy to spot such a use case.	5 years ago
Harshavardhana	bc61417284	calculate automatic node based symmetry (#9446 ) it is possible in many scenarios that even if the divisible value is optimal, we may end up with uneven distribution due to number of nodes present in the configuration. added code allows for affinity towards various ellipses to figure out optimal value across ellipses such that we can always reach a symmetric value automatically. Fixes #9416	5 years ago
Harshavardhana	97d952e61c	fix: ensure buckets are preserved if one set returns error (#9468 ) the bucket should be deleted if it can be successfully deleted on all sets, if not we should ensure to restore those buckets properly.	5 years ago
Klaus Post	073aac3d92	add data update tracking using bloom filter (#9208 ) By monitoring PUT/DELETE and heal operations it is possible to track changed paths and keep a bloom filter for this data. This can help prioritize paths to scan. The bloom filter can identify paths that have not changed, and the few collisions will only result in a marginal extra workload. This can be implemented on either a bucket+(1 prefix level) with reasonable performance. The bloom filter is set to have a false positive rate at 1% at 1M entries. A bloom table of this size is about ~2500 bytes when serialized. To not force a full scan of all paths that have changed cycle bloom filters would need to be kept, so we guarantee that dirty paths have been scanned within cycle runs. Until cycle bloom filters have been collected all paths are considered dirty.	5 years ago
Klaus Post	f19cbfad5c	fix: use per test context (#9343 ) Instead of GlobalContext use a local context for tests. Most notably this allows stuff created to be shut down when tests using it are done. After PR #9345 9331 CI is often running out of memory/time.	5 years ago
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	5 years ago
Harshavardhana	4714958e99	fix: possible connection leaks in sets init, heal (#9263 )	5 years ago
Bala FA	95e89f1712	proactive deep heal object when a bitrot is detected (#9192 )	5 years ago
Bala FA	2c3e34f001	add force delete option of non-empty bucket (#9166 ) passing HTTP header `x-minio-force-delete: true` would allow standard S3 API DeleteBucket to delete a non-empty bucket forcefully.	5 years ago
Harshavardhana	6f992134a2	fix: startup load time by reusing storageDisks (#9210 )	5 years ago
Krishna Srinivas	ef6304c5c2	Improve connectDisks() performance (#9203 )	5 years ago
Harshavardhana	813e0fc1a8	fix: optimize isConnected to avoid url.String() conversions (#9202 ) Stringifying in a loop can tax the system, avoid this and convert the endpoints to strings early on and remember them for the lifetime of the server.	5 years ago
Krishna Srinivas	45b1c66195	fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186 )	5 years ago
Harshavardhana	da04cb91ce	optimize listObjects to list only from 3 random disks (#9184 )	5 years ago
Harshavardhana	cfc9cfd84a	fix: various optimizations, idiomatic changes (#9179 ) - acquire single leader lock for all background operations - healing, crawling and applying lifecycle policies. - simplify lifecycle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space. - make listing, walking not interfere by slowing itself down like the crawler. - effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing - don't read `format.json` for prometheus metrics in StorageInfo() call.	5 years ago
Harshavardhana	d45a1808f2	fix: Walk() should require quorum number of disks only (#9164 )	5 years ago
Anis Elleuch	db2155551a	heal: Pass scan mode to HealObjects to deep scan full quorum objects (#9159 ) As an optimization of the healing, HealObjects() avoids sending an object to the background healing subsystem when the object is present in all disks. However, HealObjects() should have checked the scan type, if it is deep, always pass the object to the healing subsystem.	5 years ago
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	5 years ago
Harshavardhana	6a00eb10bf	fix: allow set drive count of proper divisible values (#9101 ) Currently the code assumed some orthogonal requirements which led to situations where when we have a setup where we have let's say for example 168 drives, the final set_drive_count chosen was 14. Indeed 168 drives are divisible by 12 but this wasn't allowed due to an unexpected requirement to have 12 to be a perfect modulo of 14 which is not possible. This assumption was incorrect. This PR fixes this old assumption properly, also adds a few tests and some negative tests as well. Improvements are seen in error messages as well.	5 years ago
kannappanr	07a7f329e7	xl: Fix counting offline disks in StorageInfo (#9082 ) Recent modification in the code led to incorrect calculation of offline disks. This commit saves the endpoint list in a xlObjects then we know the name of each disk.	5 years ago
Harshavardhana	6f66f1a910	close channel upon error in Walk()'er (#9042 )	5 years ago
Harshavardhana	23a8411732	Add a generic Walk()'er to list a bucket, optionally prefix (#9026 ) This generic Walk() is used by likes of Lifecycle, or KMS to rotate keys or any other functionality which relies on this functionality.	5 years ago
Harshavardhana	ab7d3cd508	fix: Speed up multi-object delete by taking bulk locks (#8974 ) Change distributed locking to allow taking bulk locks across objects, reduces usually 1000 calls to 1. Also allows for situations where multiple clients send delete requests to objects with following names ``` {1,2,3,4,5} ``` ``` {5,4,3,2,1} ``` will block and ensure that we do not fail the request on each other.	5 years ago
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	5 years ago
Harshavardhana	d1144c2c7e	reference format obtained doesn't need further validation (#8964 ) we don't need to validateFormats again once we have obtained reference format, because it is possible that at this stage another server is doing a disk heal during startup, once in a while due to delays we get false positives and our server doesn't start. Format in quorum as reference format can be assumed as valid and we proceed further, until and unless HealFormat re-inits the disks after a successful heal. Also use separate port for healing tests to avoid any conflicts with regular build testing. Fixes #8884	5 years ago
Harshavardhana	9ecd66007f	fix: reduce the load on CPU when loading users/policies (#8984 ) Trying to be conservative by slowing ourselves down on a regular basis.	5 years ago
Krishnan Parthasarathi	026265f8f7	Add support for bucket encryption feature (#8890 ) - pkg/bucket/encryption provides support for handling bucket encryption configuration - changes under cmd/ provide support for AES256 algorithm only Co-Authored-By: Poorna <poornas@users.noreply.github.com> Co-authored-by: Harshavardhana <harsha@minio.io>	5 years ago
Klaus Post	9990464cd5	Fix recursive deep scan of buckets (#8900 )	5 years ago
Harshavardhana	f98616dce7	heal: Optimize heal listing by avoiding batches (#8901 ) Also limit the heal per object if there is incoming requests by suspending heal for longer periods of time.	5 years ago
Harshavardhana	0cbebf0f57	Rename pkg/{tagging,lifecycle} to pkg/bucket sub-directory (#8892 ) Rename to allow for more such features to come in a more proper hierarchical manner.	5 years ago
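
Commit 6a00eb10bf (#9101) above turns on a divisibility check: a set drive count is usable when it divides the total number of drives evenly. The following is a minimal Python sketch of that check only, not the MinIO implementation. The function name `candidate_set_sizes` is hypothetical, and the 4 to 16 range of allowed set sizes is an assumption that does not come from the commit message.

```python
def candidate_set_sizes(total_drives, min_size=4, max_size=16):
    """Return every set size in [min_size, max_size] that divides total_drives evenly.

    min_size and max_size are illustrative assumptions, not values
    stated in the commit message.
    """
    return [
        size
        for size in range(min_size, max_size + 1)
        if total_drives % size == 0
    ]


if __name__ == "__main__":
    # 168 drives is the example used in the commit message.
    print(candidate_set_sizes(168))
```

For 168 drives the result contains both 12 and 14, which is consistent with the commit's point that 12 is a valid choice and should not have been rejected.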


53 Commits (7b5223d83dd179eee63b6c08eec6d9b9910f6fa2)
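
Commit 95814359bd (#9682) mentions caching disk info, reloading it at a 1 second interval, and adding a generic helper for this. The sketch below shows one way such a time-based cache could look in Python. It is an illustration under stated assumptions, not the helper from the commit: the class name `TimedValue`, its parameters, and the injectable `clock` are all hypothetical.

```python
import threading
import time


class TimedValue:
    """Cache a single value and reload it once `ttl` seconds have elapsed.

    Hypothetical helper for illustration; names and signature are assumptions.
    """

    def __init__(self, loader, ttl=1.0, clock=time.monotonic):
        self._loader = loader    # expensive call, e.g. one that queries remote disks
        self._ttl = ttl          # seconds a cached value stays fresh
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._value = None
        self._loaded_at = None   # None means nothing has been loaded yet

    def get(self):
        """Return the cached value, calling the loader only when it is stale."""
        with self._lock:
            now = self._clock()
            stale = self._loaded_at is None or now - self._loaded_at >= self._ttl
            if stale:
                self._value = self._loader()
                self._loaded_at = now
            return self._value
```

With this shape, many uploads arriving within the same interval share one loader call instead of each triggering its own, which is the effect the commit message describes.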