minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	734f258878	fix: slow down auto healing more aggressively (#10730) Bonus fixes - logging improvements to ensure that we don't use `go logger.LogIf` to avoid runtime.Caller missing the function name. log where necessary. - remove unused code at erasure sets	4 years ago
Anis Elleuch	284a2b9021	ilm: Send delete marker creation event when appropriate (#10696) Before this commit, the crawler ILM will always send object delete event notification though this is wrong.	4 years ago
Harshavardhana	2042d4873c	rename crawler config option to heal (#10678)	4 years ago
Klaus Post	03991c5d41	crawler: Remove waitForLowActiveIO (#10667) Only use dynamic delays for the crawler. Even though the max wait was 1 second the number of waits could severely impact crawler speed. Instead of relying on a global metric, we use the stateless local delays to keep the crawler running at a speed more adjusted to current conditions. The only case we keep it is before bitrot checks when enabled.	4 years ago
Harshavardhana	a0d0645128	remove safeMode behavior in startup (#10645) In almost all scenarios MinIO now is mostly ready for all sub-systems independently, safe-mode is not useful anymore and does not serve its original intended purpose. allow server to be fully functional even with config partially configured, this is to cater for availability of actual I/O v/s manually fixing the server. In k8s like environments it will never make sense to take pod into safe-mode state, because there is no real access to perform any remote operation on them.	4 years ago
Harshavardhana	736e58dd68	fix: handle concurrent lockers with multiple optimizations (#10640) - select lockers which are non-local and online to have affinity towards remote servers for lock contention - optimize lock retry interval to avoid sending too many messages during lock contention, reduces average CPU usage as well - if bucket is not set, when deleteObject fails make sure setPutObjHeaders() honors lifecycle only if bucket name is set. - fix top locks to always list the oldest lockers, avoid getting bogged down into map's unordered nature.	4 years ago
Harshavardhana	eafa775952	fix: add lock ownership to expire locks (#10571) - Add owner information for expiry, locking, unlocking a resource - TopLocks now returns locks in quorum by default, provides a way to capture stale locks as well with `?stale=true` - Simplify the quorum handling for locks to avoid from storage class, because there were challenges to make it consistent across all situations. - And other tiny simplifications to reset locks.	4 years ago
poornas	aa12d75d75	fix crawler to detect lifecycle on bucket even if filter nil (#10532)	4 years ago
Harshavardhana	1cf322b7d4	change leader locker only for crawler (#10509)	4 years ago
Harshavardhana	d616d8a857	serialize replication and feed it through task model (#10500) this allows for eventually controlling the concurrency of replication and overall control of throughput	4 years ago
Klaus Post	fa01e640f5	Continuous healing: add optional bitrot check (#10417)	4 years ago
Anis Elleuch	af88772a78	lifecycle: NoncurrentVersionExpiration considers noncurrent version age (#10444) From https://docs.aws.amazon.com/AmazonS3/latest/dev/intro-lifecycle-rules.html#intro-lifecycle-rules-actions ``` When specifying the number of days in the NoncurrentVersionTransition and NoncurrentVersionExpiration actions in a Lifecycle configuration, note the following: It is the number of days from when the version of the object becomes noncurrent (that is, when the object is overwritten or deleted), that Amazon S3 will perform the action on the specified object or objects. Amazon S3 calculates the time by adding the number of days specified in the rule to the time when the new successor version of the object is created and rounding the resulting time to the next day midnight UTC. For example, in your bucket, suppose that you have a current version of an object that was created at 1/1/2014 10:30 AM UTC. If the new version of the object that replaces the current version is created at 1/15/2014 10:30 AM UTC, and you specify 3 days in a transition rule, the transition date of the object is calculated as 1/19/2014 00:00 UTC. ```	4 years ago
Harshavardhana	309b10f201	keep crawler cycle at 5 minutes	4 years ago
Klaus Post	c097ce9c32	continuous healing based on crawler (#10103) Design: https://gist.github.com/klauspost/792fe25c315caf1dd15c8e79df124914	4 years ago
Anis Elleuch	6ae30b21c9	fix ILM should not remove a protected version (#10189)	4 years ago
poornas	c43da3005a	Add support for server side bucket replication (#9882)	4 years ago
Anis Elleuch	4cf80f96ad	fix: lifecycle XML parsing errors with Versioning (#9974)	4 years ago
Anis Elleuch	d4af132fc4	lifecycle: Expiry should not delete versions (#9972) Currently, lifecycle expiry is deleting all object versions which is not correct, unless noncurrent versions field is specified. Also, only delete the delete marker if it is the only version of the given object.	4 years ago
Harshavardhana	4915433bd2	Support bucket versioning (#9377) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	5 years ago
Klaus Post	43d6e3ae06	merge object lifecycle checks into usage crawler (#9579)	5 years ago
Harshavardhana	53aaa5d2a5	Export bucket usage counts as part of bucket metrics (#9710) Bonus fixes in quota enforcement to use the new datastructure and use timedValue to cache a value/reload automatically avoids one less global variable.	5 years ago
Harshavardhana	6e0575a53d	Revert "Disable crawler in FS/NAS gateway mode (#9695)" (#9702) This reverts commit `eba423bb9d`. Additionally also address the FS crawler to properly calculate the sizes for encrypted/compressed content.	5 years ago
Harshavardhana	eba423bb9d	Disable crawler in FS/NAS gateway mode (#9695) No one really uses FS for large scale accounting usage, nor do we crawl in NAS gateway mode. It is worthwhile to simply disable this feature as it's not useful for anyone. Bonus disable bucket quota ops as well in FS and gateway mode	5 years ago
Klaus Post	e25ace2151	Forward RPC errors from crawler (#9569) The `keepHTTPResponseAlive` would cause errors to be returned with status OK. - Add '32' as a filler byte until a response is ready - '0' to indicate the response is ready to be consumed - '1' to indicate response has an error which needs to be returned to the caller Clear out 'file not found' errors from dir walker, since it may be in a folder that has been deleted since it was scanned.	5 years ago
Harshavardhana	498389123e	avoid unnecessary logging on fresh/newly replaced drives (#9470) data usage tracker and crawler seem to be logging non-actionable information on console, which is not useful and is fixed on its own in almost all deployments, let's keep this logging to a minimum.	5 years ago
Klaus Post	073aac3d92	add data update tracking using bloom filter (#9208) By monitoring PUT/DELETE and heal operations it is possible to track changed paths and keep a bloom filter for this data. This can help prioritize paths to scan. The bloom filter can identify paths that have not changed, and the few collisions will only result in a marginal extra workload. This can be implemented on either a bucket+(1 prefix level) with reasonable performance. The bloom filter is set to have a false positive rate at 1% at 1M entries. A bloom table of this size is about ~2500 bytes when serialized. To not force a full scan of all paths that have changed cycle bloom filters would need to be kept, so we guarantee that dirty paths have been scanned within cycle runs. Until cycle bloom filters have been collected all paths are considered dirty.	5 years ago
Harshavardhana	cfc9cfd84a	fix: various optimizations, idiomatic changes (#9179) - acquire a single leader lock for all background operations - healing, crawling and applying lifecycle policies. - simplify lifecycle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space. - make listing, walking not interfere by slowing itself down like the crawler. - effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing - don't read `format.json` for prometheus metrics in StorageInfo() call.	5 years ago
Harshavardhana	b1a2169dcc	fix: data usage crawler env handling, usage-cache.bin location (#9163) canonicalize the ENVs such that we can bring these ENVs as part of the config values, as a subsequent change. - fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin` - fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json` (avoid conflicts with a bucket named `usage.json`) - fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin` (avoid conflicts with a bucket named `usage.bin`)	5 years ago
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	5 years ago
Anis Elleuch	75a0661213	data-usage: Fix the calculation of the next crawling round (#9096) This commit fixes a simple typo that miscalculated the waiting time until the next round of data crawling to compute the data usage.	5 years ago
kannappanr	d9be8bc693	Add env. variable to disable data usage crawling (#9086)	5 years ago
Klaus Post	b2db1e96e2	Remove crawler concurrency (#9023) Only have one crawler per disk. Removes locking, but keep fastwalk itself able to run concurrently.	5 years ago
Harshavardhana	ab7d3cd508	fix: Speed up multi-object delete by taking bulk locks (#8974) Change distributed locking to allow taking bulk locks across objects, reduces usually 1000 calls to 1. Also allows for situations where multiple clients send delete requests to objects with following names ``` {1,2,3,4,5} ``` ``` {5,4,3,2,1} ``` will block and ensure that we do not fail the request on each other.	5 years ago
Klaus Post	2165d45d3f	Time getSize and use to estimate latency (#8959) Remove the random sleep. This is running in 4 goroutines, so mostly doing nothing. We use the getSize latency to estimate system load, meaning when there is little load on the system and we get the result fast we sleep a little. If it took a long time we have high load and release ourselves longer. We are sleeping inside the mutex so this affects all goroutines doing IO.	5 years ago
Harshavardhana	49df290270	Add metadata parsing to be inside mutex to slow down (#8952) Adding mutex slows down the crawler to avoid large spikes in CPU, also add millisecond interval jitter in calculation of disk usage to slow down the spikes further.	5 years ago
Harshavardhana	2d295a31de	Avoid select inside a recursive function to avoid CPU spikes (#8923) Additionally also allow configurable go-routines	5 years ago
Harshavardhana	f14f60a487	fix: Avoid double usage calculation on every restart (#8856) On every restart of the server, usage was being calculated which is not useful instead wait for sufficient time to start the crawling routine. This PR also avoids lots of double allocations through strings, optimizes usage of string builders and also avoids crawling through symbolic links. Fixes #8844	5 years ago
Anis Elleuch	017067e11f	data-usage: Avoid crawling duplicated call (#8843) This fix also picks 3 and not 4 disks from a single erasure set.	5 years ago
Harshavardhana	aa2e89bfe3	Use jsoniter whenever applicable instead of encoding/json (#8766) This PR adds jsoniter package to replace encoding/json in places where faster json unmarshal is necessary whenever input JSON is large enough. Some benchmarking comparison between jsoniter and encoding/json benchmark old MB/s new MB/s speedup BenchmarkParseUnmarshal/N10-4 110.02 331.17 3.01x BenchmarkParseUnmarshal/N100-4 125.74 524.09 4.17x BenchmarkParseUnmarshal/N500-4 131.68 542.60 4.12x BenchmarkParseUnmarshal/N1000-4 133.93 514.88 3.84x BenchmarkParseUnmarshal/N5000-4 122.10 415.36 3.40x BenchmarkParseUnmarshal/N10000-4 132.13 403.90 3.06x	5 years ago
Harshavardhana	5f2318567e	Allow metadata updates on meta bucket even in WORM mode (#8657) This ensures that we can update the - .minio.sys is updated for accounting/data usage purposes - .minio.sys is updated to indicate if backend is encrypted or not.	5 years ago
Harshavardhana	c8d82588c2	Fix crash in console logger and also handle bucket DNS updates (#8654) Also fix listenBucketNotification bugs seen by minio-js listen bucket notification API.	5 years ago
Anis Elleuch	555969ee42	Add data usage collect with its new admin API (#8553) Admin data usage info API returns the following (Only FS & XL, for now) - Number of buckets - Number of objects - The total size of objects - Objects histogram - Bucket sizes	5 years ago

20 Commits (4c773f7068fc7fd058c701f29b37cb2a3088e72f)