minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	f903cae6ff	Support variable server pools (#11256 ) The current implementation requires server pools to have the same erasure stripe sizes, to facilitate the same SLA and expectations. This PR allows server pools to be variadic, i.e. they do not have to have the same erasure stripe sizes; instead they must meet the SLA for parity ratio. If the parity ratio cannot be guaranteed by the new server pool, the deployment is rejected, i.e. server pool expansion is not allowed.	4 years ago
Harshavardhana	e7ae49f9c9	fix: calculate prometheus disks_offline/disks_total correctly (#11215 ) fixes #11196	4 years ago
Harshavardhana	c19e6ce773	avoid a crash in crawler when lifecycle is not initialized (#11170 ) Bonus for static buffers use bytes.NewReader instead of bytes.NewBuffer, to use a more reader friendly implementation	4 years ago
Anis Elleuch	2ecaab55a6	admin: ServerInfo returns info without object layer initialized (#11142 )	4 years ago
Harshavardhana	db7890660e	fix: a crash when disk is nil, safe access on erasureDisks (#11089 ) fixes #11088	4 years ago
Harshavardhana	ce93b2681b	fix: re-use er.getDisks() properly in certain calls (#11043 )	4 years ago
Klaus Post	e6ea5c2703	crawler: Missing folder heal check per set (#10876 )	4 years ago
Klaus Post	2294e53a0b	Don't retain context in locker (#10515 ) Use the context for internal timeouts, but disconnect it from outgoing calls so we always receive the results and cancel it remotely.	4 years ago
Harshavardhana	d9db7f3308	expire lockers if lockers are offline (#10749 ) Lockers might currently leave stale lockers behind, waiting in unknown ways for downed lockers. The locker check interval is high enough to safely clean up stale locks.	4 years ago
Klaus Post	21a549a83b	fix: keep MRF channel open to avoid random CI crash (#10686 ) There doesn't seem to be any benefit to closing the channel, so just keep it open and let it die with the server.	4 years ago
Harshavardhana	f9be783f3e	fix: allow crawler to crawl on disks without usage constraints (#10677 ) Additionally, change the resolution of the usage-wise return of disks, which allows small byte-level differences to be masked.	4 years ago
Harshavardhana	6484453fc6	optionally allow strict quorum listing (#10649 ) ``` export MINIO_API_LIST_STRICT_QUORUM=on ``` would enable listing in quorum if necessary	4 years ago
Harshavardhana	c6a9a94f94	fix: optimize ServerInfo() handler to avoid reading config (#10626 ) fixes #10620	4 years ago
Harshavardhana	66174692a2	add '.healing.bin' for tracking currently healing disk (#10573 ) add a hint on the disk to allow for tracking fresh disk being healed, to allow for restartable heals, and also use this as a way to track and remove disks. There are more pending changes where we should move all the disk formatting logic to backend drives, this PR doesn't deal with this refactor instead makes it easier to track healing in the future.	4 years ago
Harshavardhana	eafa775952	fix: add lock ownership to expire locks (#10571 ) - Add owner information for expiry, locking, unlocking a resource - TopLocks returns now locks in quorum by default, provides a way to capture stale locks as well with `?stale=true` - Simplify the quorum handling for locks to avoid from storage class, because there were challenges to make it consistent across all situations. - And other tiny simplifications to reset locks.	4 years ago
Harshavardhana	e60834838f	fix: background disk heal, to reload format consistently (#10502 ) It was observed in VMware vsphere environment during a pod replacement, `mc admin info` might report incorrect offline nodes for the replaced drive. This issue eventually goes away but requires quite a lot of time for all servers to be in sync. This PR fixes this behavior properly.	4 years ago
Klaus Post	493c714663	Remove erasureSets and erasureObjects from ObjectLayer (#10442 )	4 years ago
Klaus Post	2d58a8d861	Add storage layer contexts (#10321 ) Add context to all (non-trivial) calls to the storage layer. Contexts are propagated through the REST client. - `context.TODO()` is left in place for the places where it needs to be added to the caller. - `endWalkCh` could probably be removed from the walkers, but no changes so far. The "dangerous" part is that now a caller disconnecting will propagate down, so a "delete" operation will now be interrupted. In some cases we might want to disconnect this functionality so the operation completes if it has started, leaving the system in a cleaner state.	4 years ago
Harshavardhana	8a291e1dc0	Cluster healthcheck improvements (#10408 ) - do not fail the healthcheck if heal status was not obtained from one of the nodes, if many nodes fail then report this as a catastrophic error. - add "x-minio-write-quorum" value to match the write tolerance supported by server. - admin info now states if a drive is healing where madmin.Disk.Healing is set to true and madmin.Disk.State is "ok"	4 years ago
Klaus Post	1b119557c2	getDisksInfo: Attribute failed disks to correct endpoint (#10360 ) If DiskInfo calls failed the information returned was used anyway resulting in no endpoint being set. This would make the drive be attributed to the local system since `disk.Endpoint == disk.DrivePath` in that case. Instead, if the call fails record the endpoint and the error only.	4 years ago
Klaus Post	17a1eda702	Disregard healing disks in crawling (#10349 ) When crawling never use a disk we know is healing. Most of the change involves keeping track of the original endpoint on xlStorage and this also fixes DiskInfo.Endpoint never being populated. Heal master will print `data-crawl: Disk "http://localhost:9001/data/mindev/data2/xl1" is Healing, skipping` once on a cycle (no more often than every 5m).	4 years ago
Klaus Post	c097ce9c32	continuous healing based on crawler (#10103 ) Design: https://gist.github.com/klauspost/792fe25c315caf1dd15c8e79df124914	4 years ago
Harshavardhana	74116204ce	handle fresh setup with mixed drives (#10273 ) In fresh drive setups where one of the drives is a root drive, we should ignore that root drive and not proceed to format it. This PR handles this properly by marking such disks as root disks and taking them offline.	4 years ago
Harshavardhana	b32d0a5b60	use the correct endpoints for offline drives	4 years ago
Harshavardhana	a20d4568a2	fix: make sure to use uniform drive count calculation (#10208 ) When a server was deployed in an asymmetric configuration in the past, such as ``` minio server ~/fs{1...4}/disk{1...5} ``` this results in a setDriveCount of 10 in older releases, but fairly recent releases have moved to server affinity, which means that the set drive count ascertained from the above config is now '4'. While the object layer makes sure that we honor `format.json`, the storageClass configuration was by mistake using the global value obtained by heuristics, which leads to prematurely using lower parity without it being requested by an administrator. This PR fixes this behavior.	4 years ago
Harshavardhana	b16781846e	allow server to start even with corrupted/faulty disks (#10175 )	4 years ago
Harshavardhana	ec06089eda	fix: re-implement cluster healthcheck (#10101 )	4 years ago
Harshavardhana	d3c81a6e93	add missing available space from metrics (#10065 )	4 years ago
Harshavardhana	e7d7d5232c	fix: admin info output and improve overall performance (#10015 ) - admin info node offline check is now quicker - admin info now doesn't duplicate the code across doing the same checks for disks - rely on StorageInfo to return appropriate errors instead of calling locally. - diskID checks now return proper errors when disk not found v/s format.json missing. - add more disk states for more clarity on the underlying disk errors.	4 years ago
Harshavardhana	a38ce29137	fix: simplify background heal and trigger heal items early (#9928 ) Bonus fix: during the versioning merge, one of the PRs was missing the offline/online disk count fix from #9801; port it correctly over to the master branch from release. Additionally, add versionID support for MRF. Fixes #9910 Fixes #9931	4 years ago
Klaus Post	1813ff9dfa	Re-add missing bucket bloom filters (#9861 )	5 years ago
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	5 years ago
Klaus Post	4a007e3767	Prefer local disks when fetching data blocks (#9563 ) If the requested server is part of the set this will always read from the local disk, even if the disk contains a parity shard. In the default setup there is a 50% chance that at least one shard that otherwise would have been fetched remotely will be read locally instead. It basically trades RPC call overhead for reed-solomon. On distributed localhost this seems to be fairly break-even, with a very small gain in throughput and latency. However on networked servers this should be a bigger gain. 1MB objects, before: ``` Operation: GET. Concurrency: 32. Hosts: 4. Requests considered: 76257: * Avg: 25ms 50%: 24ms 90%: 32ms 99%: 42ms Fastest: 7ms Slowest: 67ms * First Byte: Average: 23ms, Median: 22ms, Best: 5ms, Worst: 65ms Throughput: * Average: 1213.68 MiB/s, 1272.63 obj/s (59.948s, starting 14:45:44 CEST) ``` After: ``` Operation: GET. Concurrency: 32. Hosts: 4. Requests considered: 78845: * Avg: 24ms 50%: 24ms 90%: 31ms 99%: 39ms Fastest: 8ms Slowest: 62ms * First Byte: Average: 22ms, Median: 21ms, Best: 6ms, Worst: 57ms Throughput: * Average: 1255.11 MiB/s, 1316.08 obj/s (59.938s, starting 14:43:58 CEST) ``` Bonus fix: Only ask for heal once on an object.	5 years ago
Klaus Post	d9e7cadacf	Update reed+solomon (#9562 ) Only create encoder when strictly needed.	5 years ago
iliul	d3f9f8be88	golint: fix redundant code logic (#7842 ) Signed-off-by: Lei Liu <liul.stone@gmail.com>	5 years ago
Praveen raj Mani	c113d4e49c	Posix CreateFile should work for compressed lengths (#7584 )	6 years ago
kannappanr	5ecac91a55	Replace Minio refs in docs with MinIO and links (#7494 )	6 years ago
Harshavardhana	8e0910ab3e	Fix build issues on BSDs in pkg/cpu (#7116 ) Also add a cross compile script to always test cross compilation for some well known platforms and architectures; we support out-of-the-box compilation on these platforms even if we don't make an official release build. This script is to avoid regressions in this area when we add platform dependent code.	6 years ago
Krishna Srinivas	98c950aacd	Streaming bitrot verification support (#7004 )	6 years ago
Krishna Srinivas	52f6d5aafc	Rename of structs and methods (#6230 ) Rename of ErasureStorage to Erasure (and rename of related variables and methods)	6 years ago
Krishna Srinivas	ce02ab613d	Simplify erasure code by separating bitrot from erasure code (#5959 )	6 years ago
kannappanr	f8a3fd0c2a	Create logger package and rename errorIf to LogIf (#5678 ) Removing message from error logging Replace errors.Trace with LogIf	7 years ago
Harshavardhana	c0721164be	Automatically set goroutines based on shardSize (#5346 ) Update reedsolomon library to enable feature to automatically set number of go-routines based on the input shard size, since shard size is sort of a constant in Minio for objects > 10MiB (default blocksize) klauspost reported around 15-20% improvement in performance numbers on older systems such as AVX and SSE3 ``` name old speed new speed delta Encode10x2x10000-8 5.45GB/s ± 1% 6.22GB/s ± 1% +14.20% (p=0.000 n=9+9) Encode100x20x10000-8 1.44GB/s ± 1% 1.64GB/s ± 1% +13.77% (p=0.000 n=10+10) Encode17x3x1M-8 10.0GB/s ± 5% 12.0GB/s ± 1% +19.88% (p=0.000 n=10+10) Encode10x4x16M-8 7.81GB/s ± 5% 8.56GB/s ± 5% +9.58% (p=0.000 n=10+9) Encode5x2x1M-8 15.3GB/s ± 2% 19.6GB/s ± 2% +28.57% (p=0.000 n=9+10) Encode10x2x1M-8 12.2GB/s ± 5% 15.0GB/s ± 5% +22.45% (p=0.000 n=10+10) Encode10x4x1M-8 7.84GB/s ± 1% 9.03GB/s ± 1% +15.19% (p=0.000 n=9+9) Encode50x20x1M-8 1.73GB/s ± 4% 2.09GB/s ± 4% +20.59% (p=0.000 n=10+9) Encode17x3x16M-8 10.6GB/s ± 1% 11.7GB/s ± 4% +10.12% (p=0.000 n=8+10) ```	7 years ago
Harshavardhana	8efa82126b	Convert errors tracer into a separate package (#5221 )	7 years ago
Andreas Auernhammer	85fcee1919	erasure: simplify XL backend operations (#4649 ) (#4758 ) This change provides new implementations of the XL backend operations: - create file - read file - heal file Further this change adds table based tests for all three operations. This affects also the bitrot algorithm integration. Algorithms are now integrated in an idiomatic way (like crypto.Hash). Fixes #4696 Fixes #4649 Fixes #4359	7 years ago

45 Commits (8da0b7cf0394a7bdd6226e4ebe89a310f1ae0e70)