minio

Commit Graph

Author	SHA1	Message	Date
poornas	a8dd7b3eda	Refactor replication target management. (#10154 ) Generalize replication target management so that remote targets for a bucket can be managed with ARNs. `mc admin bucket remote` command will be used to manage targets.	4 years ago
Harshavardhana	fe157166ca	fix: Pass context all the way down to the network call in lockers (#10161 ) Context timeout might race on each other when timeouts are lower i.e when two lock attempts happened very quickly on the same resource and the servers were yet trying to establish quorum. This situation can lead to locks held which wouldn't be unlocked and subsequent lock attempts would fail. This would require a complete server restart. A potential of this issue happening is when server is booting up and we are trying to hold a 'transaction.lock' in quick bursts of timeout.	4 years ago
poornas	c43da3005a	Add support for server side bucket replication (#9882 )	4 years ago
Harshavardhana	11d21d5d1b	fix: pass around the correct drives per set (#10097 ) this is a precursor change before adding parity based SLA across zones instead of same stripe size	4 years ago
Klaus Post	00d3cc4b69	Enforce quota checks after crawl (#10036 ) Enforce bucket quotas when crawling has finished. This ensures that we will not do quota enforcement on old data. Additionally, delete less if we are closer to quota than we thought.	4 years ago
Harshavardhana	37c14207d6	fix: cors handling again for not just OPTIONS request (#10025 ) CORS is notorious requires specific headers to be handled appropriately in request and response, using cors package as part of handlerFunc() for options method lacks the necessary control this package needs to add headers.	4 years ago
Harshavardhana	5c15656c55	support bootstrap client to use healthcheck restClient (#10004 ) - reduce locker timeout for early transaction lock for more eagerness to timeout - reduce leader lock timeout to range from 30sec to 1minute - add additional log message during bootstrap phase	4 years ago
Anis Elleuch	2be20588bf	Reroute requests based token heal/listing (#9939 ) When manual healing is triggered, one node in a cluster will become the authority to heal. mc regularly sends new requests to fetch the status of the ongoing healing process, but a load balancer could land the healing request to a node that is not doing the healing request. This PR will redirect a request to the node based on the node index found described as part of the client token. A similar technique is also used to proxy ListObjectsV2 requests by encoding this information in continuation-token	4 years ago
Krishna Srinivas	4c266df863	fix: proxy ListObjects request to one of the server based on hash(bucket) (#9881 )	4 years ago
Harshavardhana	a38ce29137	fix: simplify background heal and trigger heal items early (#9928 ) Bonus fix during versioning merge one of the PR was missing the offline/online disk count fix from #9801 port it correctly over to the master branch from release. Additionally, add versionID support for MRF Fixes #9910 Fixes #9931	4 years ago
Praveen raj Mani	b1705599e1	Fix config leaks and deprecate file-based config setters in NAS gateway (#9884 ) This PR has the following changes - Removing duplicate lookupConfigs() calls. - Deprecate admin config APIs for NAS gateways. This will avoid repeated reloads of the config from the disk. - WatchConfigNASDisk will be removed - Migration guide for NAS gateways users to migrate to ENV settings. NOTE: THIS PR HAS A BREAKING CHANGE Fixes #9875 Co-authored-by: Harshavardhana <harsha@minio.io>	4 years ago
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	5 years ago
Klaus Post	43d6e3ae06	merge object lifecycle checks into usage crawler (#9579 )	5 years ago
Harshavardhana	4790868878	allow background IAM load to speed up startup (#9796 ) Also fix healthcheck handler to run success only if object layer has initialized fully for S3 API access call.	5 years ago
Harshavardhana	febe9cc26a	fix: avoid timer leaks in dsync/lsync (#9781 ) At a customer setup with lots of concurrent calls it can be observed that in newRetryTimer there were lots of tiny alloations which are not relinquished upon retries, in this codepath we were only interested in re-using the timer and use it wisely for each locker. ``` (pprof) top Showing nodes accounting for 8.68TB, 97.02% of 8.95TB total Dropped 1198 nodes (cum <= 0.04TB) Showing top 10 nodes out of 79 flat flat% sum% cum cum% 5.95TB 66.50% 66.50% 5.95TB 66.50% time.NewTimer 1.16TB 13.02% 79.51% 1.16TB 13.02% github.com/ncw/directio.AlignedBlock 0.67TB 7.53% 87.04% 0.70TB 7.78% github.com/minio/minio/cmd.xlObjects.putObject 0.21TB 2.36% 89.40% 0.21TB 2.36% github.com/minio/minio/cmd.(posix).Walk 0.19TB 2.08% 91.49% 0.27TB 2.99% os.statNolog 0.14TB 1.59% 93.08% 0.14TB 1.60% os.(File).readdirnames 0.10TB 1.09% 94.17% 0.11TB 1.25% github.com/minio/minio/cmd.readDirN 0.10TB 1.07% 95.23% 0.10TB 1.07% syscall.ByteSliceFromString 0.09TB 1.03% 96.27% 0.09TB 1.03% strings.(Builder).grow 0.07TB 0.75% 97.02% 0.07TB 0.75% path.(lazybuf).append ```	5 years ago
Harshavardhana	5686a7e273	fix NAS gateway support for policy/notification (#9765 ) Fixes #9764	5 years ago
Harshavardhana	eba423bb9d	Disable crawler in FS/NAS gateway mode (#9695 ) No one really uses FS for large scale accounting usage, neither we crawl in NAS gateway mode. It is worthwhile to simply disable this feature as its not useful for anyone. Bonus disable bucket quota ops as well in, FS and gateway mode	5 years ago
Harshavardhana	7dbfea1353	avoid net/http ErrorLog for consistent logging experience (#9672 ) net/http exposes ErrorLog but it is log.Logger instance not an interface which can be overridden, because of this reason the logging is interleaved sometimes with TLS with messages like this on the server ``` http: TLS handshake error from 139.178.70.188:63760: EOF ``` This is bit problematic for us as we need to have consistent logging view for allow --json or --quiet flags. With this PR we ensure that this format is adhered to.	5 years ago
Harshavardhana	6656fa3066	simplify further bucket configuration properly (#9650 ) This PR is a continuation from #9586, now the entire parsing logic is fully merged into bucket metadata sub-system, simplify the quota API further by reducing the remove quota handler implementation.	5 years ago
Harshavardhana	bd032d13ff	migrate all bucket metadata into a single file (#9586 ) this is a major overhaul by migrating off all bucket metadata related configs into a single object '.metadata.bin' this allows us for faster bootups across 1000's of buckets and as well as keeps the code simple enough for future work and additions. Additionally also fixes #9396, #9394	5 years ago
kannappanr	a62572fb86	Check for address flags in all positions (#9615 ) Fixes #9599	5 years ago
Harshavardhana	a1de9cec58	cleanup object-lock/bucket tagging for gateways (#9548 ) This PR is to ensure that we call the relevant object layer APIs for necessary S3 API level functionalities allowing gateway implementations to return proper errors as NotImplemented{} This allows for all our tests in mint to behave appropriately and can be handled appropriately as well.	5 years ago
Harshavardhana	2dc46cb153	Report correct error when O_DIRECT is not supported (#9545 ) fixes #9537	5 years ago
Harshavardhana	4c9de098b0	heal buckets during init and make sure to wait on quorum (#9526 ) heal buckets properly during expansion, and make sure to wait for the quorum properly such that healing can be retried.	5 years ago
Harshavardhana	b768645fde	fix: unexpected logging with bucket metadata conversions (#9519 )	5 years ago
Harshavardhana	9b3b04ecec	allow retries for bucket encryption/policy quorum reloads (#9513 ) We should allow quorum errors to be send upwards such that caller can retry while reading bucket encryption/policy configs when server is starting up, this allows distributed setups to load the configuration properly. Current code didn't facilitate this and would have never loaded the actual configs during rolling, server restarts.	5 years ago
poornas	9a547dcbfb	Add API's for managing bucket quota (#9379 ) This PR allows setting a "hard" or "fifo" quota restriction at the bucket level. Buckets that have reached the FIFO quota configured, will automatically be cleaned up in FIFO manner until bucket usage drops to configured quota. If a bucket is configured with a "hard" quota ceiling, all further writes are disallowed.	5 years ago
Klaus Post	073aac3d92	add data update tracking using bloom filter (#9208 ) By monitoring PUT/DELETE and heal operations it is possible to track changed paths and keep a bloom filter for this data. This can help prioritize paths to scan. The bloom filter can identify paths that have not changed, and the few collisions will only result in a marginal extra workload. This can be implemented on either a bucket+(1 prefix level) with reasonable performance. The bloom filter is set to have a false positive rate at 1% at 1M entries. A bloom table of this size is about ~2500 bytes when serialized. To not force a full scan of all paths that have changed cycle bloom filters would need to be kept, so we guarantee that dirty paths have been scanned within cycle runs. Until cycle bloom filters have been collected all paths are considered dirty.	5 years ago
Harshavardhana	f14bf25cb9	optimize Listen bucket notification implementation (#9444 ) this commit avoids lots of tiny allocations, repeated channel creates which are performed when filtering the incoming events, unescaping a key just for matching. also remove deprecated code which is not needed anymore, avoids unexpected data structure transformations from the map to slice.	5 years ago
Nitish Tiwari	ebf3dda449	Update server startup example to showcase local erasure code (#9407 )	5 years ago
Klaus Post	f19cbfad5c	fix: use per test context (#9343 ) Instead of GlobalContext use a local context for tests. Most notably this allows stuff created to be shut down when tests using it is done. After PR #9345 9331 CI is often running out of memory/time.	5 years ago
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	5 years ago
Harshavardhana	e7276b7b9b	fix: make single locks for both IAM and object-store (#9279 ) Additionally add context support for IAM sub-system	5 years ago
Krishna Srinivas	541a778d7b	fix: do not exit on bootstrap Verify() to allow for rolling upgrades (#9235 )	5 years ago
Harshavardhana	6f992134a2	fix: startup load time by reusing storageDisks (#9210 )	5 years ago
Sidhartha Mani	0c80bf45d0	Implement oboard diagnostics admin API (#9024 ) - Implement a graph algorithm to test network bandwidth from every node to every other node - Saturate any network bandwidth adaptively, accounting for slow and fast network capacity - Implement parallel drive OBD tests - Implement a paging mechanism for OBD test to provide periodic updates to client - Implement Sys, Process, Host, Mem OBD Infos	5 years ago
Harshavardhana	6f6a2214fc	Add rate limiter for S3 API layer (#9196 ) - total number of S3 API calls per server - maximum wait duration for any S3 API call This implementation is primarily meant for situations where HDDs are not capable enough to handle the incoming workload and there is no way to throttle the client. This feature allows MinIO server to throttle itself such that we do not overwhelm the HDDs.	5 years ago
Harshavardhana	cfc9cfd84a	fix: various optimizations, idiomatic changes (#9179 ) - acquire since leader lock for all background operations - healing, crawling and applying lifecycle policies. - simplify lifecyle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space. - make listing, walking not interfere by slowing itself down like the crawler. - effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing - don't read `format.json` for prometheus metrics in StorageInfo() call.	5 years ago
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	5 years ago
Anis Elleuch	7fdeb44372	info: Initialize boot time early so uptime will always be correct (#9154 )	5 years ago
Harshavardhana	dcd63b4146	fix: avoid double ListBuckets() loading object lock (#9031 )	5 years ago
Nitish Tiwari	15e2ea2c96	Fix an issue where MinIO was logging every error twice (#8953 ) The logging subsystem was initialized under init() method in both gateway-main.go and server-main.go which are part of same package. This created two logging targets and hence errors were logged twice. This PR moves the init() method to common-main.go	5 years ago
Krishnan Parthasarathi	026265f8f7	Add support for bucket encryption feature (#8890 ) - pkg/bucket/encryption provides support for handling bucket encryption configuration - changes under cmd/ provide support for AES256 algorithm only Co-Authored-By: Poorna <poornas@users.noreply.github.com> Co-authored-by: Harshavardhana <harsha@minio.io>	5 years ago
Harshavardhana	d76160c245	Initialize only one retry timer for all sub-systems (#8913 ) Also make sure that we create buckets on all zones successfully, do not run quick heal buckets if not running with expansion.	5 years ago
Klaus Post	c7178d2066	Profiling: Add base, fix memory profiling (#8850 ) For 'snapshot' type profiles, record a 'before' profile that can be used as `go tool pprof -base=before ...` to compare before and after. "Before" profiles are included in the zipped package. [`runtime.MemProfileRate`](https://golang.org/pkg/runtime/#pkg-variables) should not be updated while the application is running, so we set it at startup. Co-authored-by: Harshavardhana <harsha@minio.io>	5 years ago
Harshavardhana	0879a4f743	rest/storage: Remove racy LastError usage (#8817 ) instead perform a liveness check call to verify if server is online and print relevant errors. Also introduce a StorageErr string error type instead of errors.New() deprecate usage of VerifyFileError, DeleteFileError for gob, change in datastructure also requires bump in storage REST version to v13. Fixes #8811	5 years ago
Klaus Post	37b32199e3	Validate XL sets on format (#8779 ) When formatting a set validate if a host failure will likely lead to data loss. While we don't know what config will be set in the future evaluate to our best knowledge, assuming default settings.	5 years ago
Harshavardhana	5aa5dcdc6d	lock: improve locker initialization at init (#8776 ) Use reference format to initialize lockers during startup, also handle `nil` for NetLocker in dsync and remove errorLocker implementation Add further tuning parameters such as - DialTimeout is now 15 seconds from 30 seconds - KeepAliveTimeout is not 20 seconds, 5 seconds more than default 15 seconds - ResponseHeaderTimeout to 10 seconds - ExpectContinueTimeout is reduced to 3 seconds - DualStack is enabled by default remove setting it to `true` - Reduce IdleConnTimeout to 30 seconds from 1 minute to avoid idleConn build up Fixes #8773	5 years ago
George Xie	7f31d933a8	fixes some typos, for CREDITS change (#8743 )	5 years ago
Harshavardhana	725172e13b	fix: Do not need safe-mode for unreachable targets upon restart (#8686 )	5 years ago

1 2 3 4 5 ...

313 Commits (76e2713ffe5f54fda843449e1d9bb7a169907f5d)