minio

Commit Graph

Author	SHA1	Message	Date
Klaus Post	cae09d8b84	crawler: Wait max 1 second (#9894 ) Add 1-second timeout to crawler wait. This will make the crawler able to run, albeit very, very slowly on high load servers.	4 years ago
Klaus Post	972d876ca9	Do not select zones with <5% free after upload (#9877 ) Looking into full disk errors on zoned setup. We don't take the 5% space requirement into account when selecting a zone. The interesting part is that even considering this we don't know the size of the object the user wants to upload when they do multipart uploads. It seems quite defensive to always upload multiparts to the zone where there is the most space since all load will be directed to a part of the cluster. In these cases we make sure it can at least hold a 1GiB file and we disadvantage fuller zones more by subtracting the expected size before weighing.	4 years ago
Harshavardhana	9626a981bc	fix: Preserve old data appropriately (#9873 ) This PR fixes all the below scenarios and handles them correctly. - existing data/bucket is replaced with new content, no versioning enabled old structure vanishes. - existing data/bucket - enable versioning before uploading any data, once versioning enabled upload new content, old content is preserved. - suspend versioning on the bucket again, now upload content again the old content is purged since that is the default "null" version. Additionally sync data after xl.json -> xl.meta rename(), to avoid any surprises if there is a crash during this rename operation.	4 years ago
Harshavardhana	94424e14d7	fix: rename legacy xl.json to xl.meta properly in ListDir() (#9863 )	4 years ago
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	5 years ago
Klaus Post	43d6e3ae06	merge object lifecycle checks into usage crawler (#9579 )	5 years ago
Klaus Post	142b057be8	Check object names on windows (#9798 ) Uploading files with names that could not be written to disk would result in "reduce your request" errors returned. Instead check explicitly for disallowed characters and reject files with `Object name contains unsupported characters.`	5 years ago
Harshavardhana	b2db8123ec	Preserve errors returned by diskInfo to detect disk errors (#9727 ) This PR basically reverts #9720 and re-implements it differently	5 years ago
Harshavardhana	0c71ce3398	fix size accounting for encrypted/compressed objects (#9690 ) size calculation in crawler was using the real size of the object instead of its actual size i.e either a decrypted or uncompressed size. this is needed to make sure all other accounting such as bucket quota and mcs UI to display the correct values.	5 years ago
Anis Elleuch	9baeda781a	fix storage info output with unordered endpoints arguments (#9610 ) Shuffling arguments that we pass to MinIO server are supported. However, when that happens, Prometheus returns wrong information about disks usage and online/offline status. The commit fixes the issue by avoiding relying on xl.endpoints since it is not ordered.	5 years ago
Klaus Post	ee9077db7d	fix: windows tests for all cases (#9594 ) Replaces #9299	5 years ago
Harshavardhana	6ac48a65cb	fix: use unused cacheMetrics code in prometheus (#9588 ) remove all other unused/dead code	5 years ago
Anis Elleuch	6885c72f32	disable check for DirectIO in standalone FS mode (#9558 )	5 years ago
Harshavardhana	2dc46cb153	Report correct error when O_DIRECT is not supported (#9545 ) fixes #9537	5 years ago
Harshavardhana	fea4a1e68e	fix logical error in path length handling for windows (#9520 ) fixes #9515	5 years ago
Anis Elleuch	3e063cca5c	Show the cause error in startup when directio is not supported (#9497 ) This commit tries to create a file using direct i/o in the startup so the server returns quickly and avoid cryptic other errors.	5 years ago
Harshavardhana	ab77b216d1	fix: remove restrictions on windows for NAME_MAX (#9469 ) Fixes #9393	5 years ago
Harshavardhana	498389123e	avoid unnecessary logging on fresh/newly replaced drives (#9470 ) data usage tracker and crawler seem to be logging non-actionable information on console, which is not useful and is fixed on its own in almost all deployments, let's keep this logging to a minimum.	5 years ago
Anis Elleuch	c434dff0a4	posix: Add missing error return in RenameFile() (#9319 ) Although it should not happen in most cases.	5 years ago
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	5 years ago
Harshavardhana	e20e08d700	fix: remove the sleep from listing operations (#9287 ) make rest of the Walk() function more predictable, it was observed that in nominal deployments even without much workload the drives are generally slow to respond to readdir operations, for the sleepDuration factor of 10 this can cause unexpected slowness in the Listing calls, while it is good for all other I/O, it may simply slow down Listing immensely which is not useful. fixes #9261	5 years ago
Anis Elleuch	e51e465543	delete: Use physical Dir() for proper prefix cleanup in Windows (#9297 ) In FS mode under Windows, removing an object will not automatically remove parent empty prefixes. The reason is that path.Dir() was used, however filepath.Dir() is more appropriate since filepath is physical (meaning it operates on OS filesystem paths). This is not caught because failure for Windows CI is not caught.	5 years ago
Bala FA	2c3e34f001	add force delete option of non-empty bucket (#9166 ) passing HTTP header `x-minio-force-delete: true` would allow standard S3 API DeleteBucket to delete a non-empty bucket forcefully.	5 years ago
Harshavardhana	6f992134a2	fix: startup load time by reusing storageDisks (#9210 )	5 years ago
Krishna Srinivas	45b1c66195	fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186 )	5 years ago
Harshavardhana	cfc9cfd84a	fix: various optimizations, idiomatic changes (#9179 ) - acquire since leader lock for all background operations - healing, crawling and applying lifecycle policies. - simplify lifecycle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space. - make listing, walking not interfere by slowing itself down like the crawler. - effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing - don't read `format.json` for prometheus metrics in StorageInfo() call.	5 years ago
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	5 years ago
Krishna Srinivas	2e9fed1a14	non-empty dirs should not be listed as objects (#9129 )	5 years ago
Kody A Kantor	06e30b5aa1	Skip building directio on platforms that don't support Direct IO (#9059 )	5 years ago
Anis Elleuch	0af62d35a0	xl: Implement posix.DeletePrefixes to enhance delete perf (#9100 ) Bulk delete API was using cleanupObjectsBulk() which calls posix listing and delete API to remove objects internal files in the backend (xl.json and parts) one by one. Add DeletePrefixes in the storage API to remove the content of a directory in a single call. Also use a remove goroutine for each disk to accelerate removal.	5 years ago
Harshavardhana	88ae0f1196	Improve delete performance by reducing the number of calls (#9092 ) - Remove the requirement to honor storage class for deletes - Improve `posix.DeleteFileBulk` code to Stat the volumeDir only once per call, rather than for all object paths.	5 years ago
Harshavardhana	23a8411732	Add a generic Walk()'er to list a bucket, optionally prefix (#9026 ) This generic Walk() is used by the likes of Lifecycle, or KMS to rotate keys or any other functionality which relies on this functionality.	5 years ago
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	5 years ago
Klaus Post	d0cea7adea	Fix stream read IO count (#8961 ) Streams are returning a readcloser and returning would decrement io count instantly, fix it. change maxActiveIOCount to 3, meaning it will pause crawling if 3 operations are running.	5 years ago
Harshavardhana	2d295a31de	Avoid select inside a recursive function to avoid CPU spikes (#8923 ) Additionally also allow configurable go-routines	5 years ago
Harshavardhana	f14f60a487	fix: Avoid double usage calculation on every restart (#8856 ) On every restart of the server, usage was being calculated which is not useful instead wait for sufficient time to start the crawling routine. This PR also avoids lots of double allocations through strings, optimizes usage of string builders and also avoids crawling through symbolic links. Fixes #8844	5 years ago
Harshavardhana	fc5213258e	posix: Do not take disk offline on I/O errors (#8836 ) Choosing maxAllowedIOError is arbitrary and prone to errors, when drives might be perfectly capable of taking I/O with only a few locations returning I/O errors. This is a hindrance of sorts where backend filesystems like ZFS can automatically fix and handle these scenarios. The added problem with the current approach is that we take the drive offline, making it virtually impossible to bring it online without restarting the server, which is not desirable on a busy cluster. Remove this state and let the backend return the error appropriately to the caller, and let the caller decide what to do with the error.	5 years ago
Anis Elleuch	c18fbdb29a	posix: Remove a non needed nil check in DiskInfo() (#8830 ) posix.DiskInfo() returns errFaultyDisk when posix is nil, but there is no way that this would happen any time, therefore removing un-needed code.	5 years ago
Harshavardhana	0879a4f743	rest/storage: Remove racy LastError usage (#8817 ) instead perform a liveness check call to verify if server is online and print relevant errors. Also introduce a StorageErr string error type instead of errors.New() deprecate usage of VerifyFileError, DeleteFileError for gob, change in datastructure also requires bump in storage REST version to v13. Fixes #8811	5 years ago
Klaus Post	37b32199e3	Validate XL sets on format (#8779 ) When formatting a set validate if a host failure will likely lead to data loss. While we don't know what config will be set in the future evaluate to our best knowledge, assuming default settings.	5 years ago
Harshavardhana	5aa5dcdc6d	lock: improve locker initialization at init (#8776 ) Use reference format to initialize lockers during startup, also handle `nil` for NetLocker in dsync and remove errorLocker implementation Add further tuning parameters such as - DialTimeout is now 15 seconds from 30 seconds - KeepAliveTimeout is now 20 seconds, 5 seconds more than default 15 seconds - ResponseHeaderTimeout to 10 seconds - ExpectContinueTimeout is reduced to 3 seconds - DualStack is enabled by default remove setting it to `true` - Reduce IdleConnTimeout to 30 seconds from 1 minute to avoid idleConn build up Fixes #8773	5 years ago
Harshavardhana	f68a7005c0	Improve disk formatting stage for large disk sets (#8690 )	5 years ago
Anis Elleuch	555969ee42	Add data usage collect with its new admin API (#8553 ) Admin data usage info API returns the following (Only FS & XL, for now) - Number of buckets - Number of objects - The total size of objects - Objects histogram - Bucket sizes	5 years ago
Nitish Tiwari	3df7285c3c	Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591 ) This PR adds support below metrics - Cache Hit Count - Cache Miss Count - Data served from Cache (in Bytes) - Bytes received from AWS S3 - Bytes sent to AWS S3 - Number of requests sent to AWS S3 Fixes #8549	5 years ago
Harshavardhana	2ab8d5e47f	Enable build verification with race (#8583 )	5 years ago
Klaus Post	c7844fb1fb	posix: cache disk ID for a short while (#8564 ) `posix.getDiskID()` takes up to 30% of all CPU due to the `os.Stat` call on `GET` calls. Before: ``` Operation: GET - Concurrency: 12 Average: 1333.97 MB/s, 1365.99 obj/s, 1365.98 ops ended/s (4m59.975s) * First Byte: Average: 7.801487ms, Median: 7.9974ms, Best: 1.9822ms, Worst: 110.0021ms Aggregated, split into 299 x 1s time segments: * Fastest: 1453.50 MB/s, 1488.38 obj/s, 1492.00 ops ended/s (1s) * 50% Median: 1360.47 MB/s, 1393.12 obj/s, 1393.00 ops ended/s (1s) * Slowest: 978.68 MB/s, 1002.17 obj/s, 1004.00 ops ended/s (1s) ``` After: ``` Operation: GET - Concurrency: 12 * Average: 1706.07 MB/s, 1747.02 obj/s, 1747.01 ops ended/s (4m59.985s) * First Byte: Average: 5.797886ms, Median: 5.9959ms, Best: 996.3µs, Worst: 84.0007ms Aggregated, split into 299 x 1s time segments: * Fastest: 1830.03 MB/s, 1873.96 obj/s, 1872.00 ops ended/s (1s) * 50% Median: 1735.04 MB/s, 1776.68 obj/s, 1776.00 ops ended/s (1s) * Slowest: 994.94 MB/s, 1018.82 obj/s, 1018.00 ops ended/s (1s) ``` TLDR; `os.Stat` is not free.	5 years ago
Klaus Post	890b493a2e	Use random file name for write check (#8563 ) Since there may be multiple writes going on concurrently Use a random file name for the write check to avoid collisions.	5 years ago
Klaus Post	1dd38750f7	Remove read-ahead for small files (#8522 ) We should only read ahead if we are reading big files. We enable it for files >= 16MB. Benchmark on 64KB objects. Before: ``` Operation: GET Errors: 0 Average: 59.976s, 87.13 MB/s, 1394.07 ops ended/s. Fastest: 1s, 90.99 MB/s, 1455.00 ops ended/s. 50% Median: 1s, 87.53 MB/s, 1401.00 ops ended/s. Slowest: 1s, 81.39 MB/s, 1301.00 ops ended/s. ``` After: ``` Operation: GET Errors: 0 Average: 59.992s, 207.99 MB/s, 3327.85 ops ended/s. Fastest: 1s, 219.20 MB/s, 3507.00 ops ended/s. 50% Median: 1s, 210.54 MB/s, 3368.00 ops ended/s. Slowest: 1s, 179.14 MB/s, 2865.00 ops ended/s. ``` The 64KB buffer is actually a small disadvantage for this case, but I believe it will be better in general than no buffer.	5 years ago
Praveen raj Mani	fa325665b1	Do not append the endpoint for fs/xl disks in StorageInfo (#8472 )	5 years ago
Krishna Srinivas	980bf78b4d	Detect underlying disk mount/unmount (#8408 )	5 years ago

5 Commits (28a1a171872519ce03b154711475539e7f652802)