minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	c8b84a0e9e	Add nancy vulnerability scanner (#10289 )	4 years ago
Jorge Israel Peña	4752323e1c	Use hdfs.Readdir() to optimize HDFS directory listings (#10121 ) Currently, listing directories on HDFS incurs a per-entry remote Stat() call penalty, the cost of which can really blow up on directories with many entries (+1,000) especially when considered in addition to peripheral calls (such as validation) and the fact that minio is an intermediary to the client (whereas other clients listed below can query HDFS directly). Because listing directories this way is expensive, the Golang HDFS library provides the [`Client.Open()`] function which creates a [`FileReader`] that is able to batch multiple calls together through the [`Readdir()`] function. This is substantially more efficient for very large directories. In one case we were witnessing about +20 seconds to list a directory with 1,500 entries, admittedly large, but the Java hdfs ls utility as well as the HDFS library sample ls utility were much faster. Hadoop HDFS DFS (4.02s): λ ~/code/minio → use-readdir » time hdfs dfs -ls /directory/with/1500/entries/ … hdfs dfs -ls 5.81s user 0.49s system 156% cpu 4.020 total Golang HDFS library (0.47s): λ ~/code/hdfs → master » time ./hdfs ls -lh /directory/with/1500/entries/ … ./hdfs ls -lh 0.13s user 0.14s system 56% cpu 0.478 total mc and minio without optimization (16.96s): λ ~/code/minio → master » time mc ls myhdfs/directory/with/1500/entries/ … ./mc ls 0.22s user 0.29s system 3% cpu 16.968 total mc and minio with optimization (0.40s): λ ~/code/minio → use-readdir » time mc ls myhdfs/directory/with/1500/entries/ … ./mc ls 0.13s user 0.28s system 102% cpu 0.403 total [`Client.Open()`]: https://godoc.org/github.com/colinmarc/hdfs#Client.Open [`FileReader`]: https://godoc.org/github.com/colinmarc/hdfs#FileReader [`Readdir()`]: https://godoc.org/github.com/colinmarc/hdfs#FileReader.Readdir	4 years ago
poornas	c43da3005a	Add support for server side bucket replication (#9882 )	4 years ago
Harshavardhana	ec06089eda	fix: re-implement cluster healthcheck (#10101 )	4 years ago
Harshavardhana	2955aae8e4	feat: Add notification support for bucketCreates and removal (#10075 )	4 years ago
Harshavardhana	14b1c9f8e4	fix: return Range errors after If-Matches (#10045 ) closes #7292	4 years ago
Harshavardhana	4bfc50411c	fix: return versionId in tagging APIs (#10068 )	4 years ago
Harshavardhana	14ff7f5fcf	add hdfs sub-path support (#10046 ) Users who don't have access to HDFS rootPath '/' can optionally specify `minio gateway hdfs hdfs://namenode:8200/path`, a path to which they have access, allowing all writes to be performed at `/path`. NOTE: once configured in this manner you need to make sure the command line is correctly specified, otherwise your data might not be visible. closes #10011	4 years ago
Anis Elleuch	778e9c864f	Move dependency from minio-go v6 to v7 (#10042 )	4 years ago
Harshavardhana	e7d7d5232c	fix: admin info output and improve overall performance (#10015 ) - admin info node offline check is now quicker - admin info now doesn't duplicate the code for doing the same checks for disks - rely on StorageInfo to return appropriate errors instead of calling locally. - diskID checks now return proper errors when disk is not found vs. format.json is missing. - add more disk states for more clarity on the underlying disk errors.	4 years ago
Harshavardhana	3b9fbf80ad	fix: make sure to use new restClient for healthcheck (#10026 ) Without instantiating a new rest client we can have a recursive error which can lead to healthcheck returning always offline, this can prematurely take the servers offline.	4 years ago
kannappanr	efe9fe6124	azure: Return success when deleting non-existent object (#9981 )	4 years ago
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	4 years ago
kannappanr	225b812b5e	Update minio-go library to latest (#9813 )	4 years ago
Harshavardhana	5686a7e273	fix NAS gateway support for policy/notification (#9765 ) Fixes #9764	5 years ago
Anis Elleuch	fd0de4ab32	azure: Show better message when credentials are wrong (#9748 )	5 years ago
Anis Elleuch	bd59f150b8	azure: Implement CopyPart API (#9747 )	5 years ago
Harshavardhana	38ee40d59c	move to upstream code colinmarc/hdfs (#9738 ) - supports SASL based authentication now - upgrades to new changes in gokrb library - implement force delete feature Fixes #8206	5 years ago
kannappanr	d583f1ac0e	check if container is empty before invoking DeleteContainer (#9733 )	5 years ago
Harshavardhana	b2db8123ec	Preserve errors returned by diskInfo to detect disk errors (#9727 ) This PR basically reverts #9720 and re-implements it differently	5 years ago
Harshavardhana	b330c2c57e	Introduce simpler GetMultipartInfo call for performance (#9722 ) Advantages: avoids hundreds of stats which are needed for each upload operation in FS/NAS gateway mode when uploading a large multipart object, dramatically increases performance for multipart uploads by avoiding recursive calls. For other gateways, simplifies the approach since azure, gcs, hdfs gateways don't capture any specific metadata during upload which needs handler validation for encryption/compression. Erasure coding was already optimized; this additionally just avoids small allocations of a large data structure. Fixes #7206	5 years ago
P R	3f6d624c7b	add gateway object tagging support (#9124 )	5 years ago
Harshavardhana	1bc32215b9	enable full linter across the codebase (#9620 ) enable linter using golangci-lint across codebase to run a bunch of linters together, we shall enable new linters as we fix more things in the codebase. This PR fixes the first stage of this cleanup.	5 years ago
kannappanr	a62572fb86	Check for address flags in all positions (#9615 ) Fixes #9599	5 years ago
Harshavardhana	d348ec0f6c	avoid double listObjectParts calls improves performance (#9606 ) this PR is to avoid double calls across multiple calls in APIs - CopyObjectPart - PutObjectPart	5 years ago
kannappanr	6c1bbf918d	do not add quotes around etag, if already present (#9603 )	5 years ago
Harshavardhana	a1de9cec58	cleanup object-lock/bucket tagging for gateways (#9548 ) This PR is to ensure that we call the relevant object layer APIs for necessary S3 API level functionalities allowing gateway implementations to return proper errors as NotImplemented{} This allows for all our tests in mint to behave appropriately and can be handled appropriately as well.	5 years ago
poornas	0f1389e992	Fix azure gateway handling of ETag for CopyObject (#9544 ) fixes #9428	5 years ago
Harshavardhana	9dda1fd624	Remove B2 gateway implementation (#9547 ) S3 is now natively supported by B2 cloud storage provider there is no reason to use specialized gateway for B2 anymore, our current S3 gateway with caching would work with B2. Resolves #8584	5 years ago
Boaz	ac5061df2c	fix: make azure gateway chunk size configurable (#9292 )	5 years ago
Harshavardhana	282c9f790a	fix: validate partNumber in queryParam as part of preConditions (#9386 )	5 years ago
Harshavardhana	3ff5bf2369	fix: convert storage class into azure tiers (#9381 )	5 years ago
Harshavardhana	69ee28a082	remove OSS gateway due to lack of licensing (#9390 ) OSS go sdk lacks licensing terms in their repository, and there has been no activity on the issue here https://github.com/aliyun/aliyun-oss-go-sdk/issues/245 This PR is to ensure we remove any dependency code which lacks explicit license file in their repo.	5 years ago
Harshavardhana	69fb68ef0b	fix simplify code to start using context (#9350 )	5 years ago
Harshavardhana	7d636a7c13	enable --compat flag by default (#9326 ) if needed use --no-compat to disable md5sum while verifying any performance numbers. bring back --compat behavior as default to avoid additional documentation and confusing behavior, as we are working towards improving md5sum to be faster on AVX instructions, enabling this should be hardly a problem in future versions of MinIO. fixes #8012 fixes #7859 fixes #7642	5 years ago
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	5 years ago
ebozduman	8dd63a462f	fix: ETag returned by OSS endpoint (#9243 )	5 years ago
Ingmar Runge	fa4d627b57	B2 gateway S3 compat: return MD5 hash as ETag from PutObject (#9183 ) - B2 does actually return an MD5 hash for newly uploaded objects so we can use it to provide better compatibility with S3 client libraries that assume the ETag is the MD5 hash such as boto. - depends on change in blazer library. - new behaviour is only enabled if MinIO's --compat mode is active. - behaviour for multipart uploads is unchanged (works fine as is).	5 years ago
Bala FA	2c3e34f001	add force delete option of non-empty bucket (#9166 ) passing HTTP header `x-minio-force-delete: true` would allow standard S3 API DeleteBucket to delete a non-empty bucket forcefully.	5 years ago
Harshavardhana	3d3beb6a9d	Add response header timeouts (#9170 ) - Add conservative timeouts up to 3 minutes for internode communication - Add aggressive timeouts of 30 seconds for gateway communication Fixes #9105 Fixes #8732 Fixes #8881 Fixes #8376 Fixes #9028	5 years ago
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	5 years ago
Krishna Srinivas	2e9fed1a14	non-empty dirs should not be listed as objects (#9129 )	5 years ago
Harshavardhana	e3b44c3829	Remove partName, partETag requirement (#9044 ) This is a precursor change before versioning, removes/deprecates the requirement of remembering partName and partETag which are not useful after a multipart transaction has finished. This PR reduces the overall size of the backend JSON for large file uploads.	5 years ago
poornas	224b4f13b8	Add cache eviction low and high watermarks (#8958 ) To allow better control of the cache eviction process, introduce MINIO_CACHE_WATERMARK_LOW and MINIO_CACHE_WATERMARK_HIGH env. variables to specify when to stop/start cache eviction process. Deprecate MINIO_CACHE_EXPIRY environment variable. Cache gc sweeps at 30 minute intervals whenever high watermark is reached to clear least recently accessed entries in the cache until sufficient space is cleared to reach the low watermark. Garbage collection uses an adaptive file scoring approach based on last access time, with greater weights assigned to larger objects and those with more hits to find the candidates for eviction. Thanks to @klauspost for this file scoring algorithm Co-authored-by: Klaus Post <klauspost@minio.io>	5 years ago
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	5 years ago
Nitish Tiwari	63be4709b7	Add metrics support for Azure & GCS Gateway (#8954 ) We added support for caching and S3 related metrics in #8591. As a continuation, it would be helpful to add support for Azure & GCS gateway related metrics as well.	5 years ago
Harshavardhana	0cbebf0f57	Rename pkg/{tagging,lifecycle} to pkg/bucket sub-directory (#8892 ) Rename to allow for more such features to come in a more proper hierarchical manner.	5 years ago
Forest Lovewood	dd93eee1e3	Implement bucket caching for b2 gateway (#8820 ) fixes #8739 #6806	5 years ago
Harshavardhana	09ee145e9c	gw/hdfs: indicate hdfs gateway is production ready (#8848 )	5 years ago
Harshavardhana	fca4ee84c9	gw/hdfs: listing should list directories properly (#8827 ) Fixes #8822	5 years ago
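
Commit 2c3e34f001 above describes a `x-minio-force-delete: true` header that lets the standard S3 DeleteBucket call remove a non-empty bucket. The following is a minimal sketch of how a client might attach that header using Go's standard `net/http` package. The endpoint `http://localhost:9000`, the bucket name `mybucket`, and the helper `newForceDeleteRequest` are hypothetical, and request signing (which a real deployment would likely require) is left out, so this only builds the request and does not send it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newForceDeleteRequest builds (but does not send) a DeleteBucket request
// carrying the x-minio-force-delete header named in commit 2c3e34f001.
// The helper name, endpoint, and bucket are illustrative assumptions;
// authentication/signing is intentionally omitted.
func newForceDeleteRequest(endpoint, bucket string) (*http.Request, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return nil, err
	}
	// Path-style bucket addressing: DELETE /<bucket>
	u.Path = "/" + bucket

	req, err := http.NewRequest(http.MethodDelete, u.String(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("x-minio-force-delete", "true")
	return req, nil
}

func main() {
	req, err := newForceDeleteRequest("http://localhost:9000", "mybucket")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Show the method, target URL, and the header value that was set.
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("x-minio-force-delete"))
}
```

In practice a signed S3 client would be used to send the request; the sketch only shows where the header goes.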


197 Commits (81c90ae43013a0616a0cda15f0ef1f4d2fb72bc4)