minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	fb96779a8a	Add large bucket support for erasure coded backend (#5160 ) This PR implements an object layer which combines input erasure sets of XL layers into a unified namespace. This object layer extends the existing erasure coded implementation. The design assumes that providing > 16 disks is a static configuration as well, i.e. if you started the setup with 32 disks as 4 sets with 8 disks per pack, then you would always need to provide 4 sets. Some design details and restrictions: - Objects are distributed using consistent ordering to a unique erasure coded layer. - Each pack has its own dsync so locks are synchronized properly at pack (erasure layer). - Each pack still has a maximum of 16 disks requirement, you can start with multiple such sets statically. - Sets of disks are static and cannot be changed, there is no elastic expansion allowed. - Sets of disks are static and cannot be changed, there is no elastic removal allowed. - ListObjects() across sets can be noticeably slower since List happens on all servers, and is merged at this sets layer. Fixes #5465 Fixes #5464 Fixes #5461 Fixes #5460 Fixes #5459 Fixes #5458 Fixes #5488 Fixes #5489 Fixes #5497 Fixes #5496	7 years ago
Aditya Manthramurthy	a337ea4d11	Move admin APIs to new path and add redesigned heal APIs (#5351 ) - Changes related to moving admin APIs - admin APIs now have an endpoint under /minio/admin - admin APIs are now versioned - a new API to serve the version is added at "GET /minio/admin/version" and all API operations have the path prefix /minio/admin/v1/<operation> - new service stop API added - credentials change API is moved to /minio/admin/v1/config/credential - credentials change API and configuration get/set API now require TLS so that credentials are protected - all API requests now receive JSON - heal APIs are disabled as they will be changed substantially - Heal API changes Heal API is now provided at a single endpoint with the ability for a client to start a heal sequence on all the data in the server, a single bucket, or under a prefix within a bucket. When a heal sequence is started, the server returns a unique token that needs to be used for subsequent 'status' requests to fetch heal results. On each status request from the client, the server returns heal result records that it has accumulated since the previous status request. The server accumulates up to 1000 records and pauses healing further objects until the client requests status. If the client does not request any further records for a long time, the server aborts the heal sequence automatically. A heal result record is returned for each entity healed on the server, such as system metadata, object metadata, buckets and objects, and has information about the before and after states on each disk. A client may request to force restart a heal sequence - this causes the running heal sequence to be aborted at the next safe spot and starts a new heal sequence.	7 years ago
Nitish Tiwari	ede504400f	Add validation of xlMeta ErasureInfo field (#5389 )	7 years ago
Harshavardhana	d45a8784fc	Fix hash order to generate more even distribution (#5247 ) The problem in the existing code was the following line ``` start := int(keyCrc%uint32(cardinality)) | 1 ``` For a given cardinality N, the bitwise '|' always forces the result to be odd, leading to a higher affinity for odd sequences. As can be seen from the test cases, this can lead to many objects being allocated the same set of disks, or at least the first disk always being an odd disk. This introduces a performance problem for the majority of objects under concurrent load. Remove `| 1` to provide a cleaner distribution; the new code is: ``` start := int(keyCrc % uint32(cardinality)) ``` Thanks to Krishna Srinivas for pointing out the bitwise situation here.	7 years ago
Harshavardhana	8efa82126b	Convert errors tracer into a separate package (#5221 )	7 years ago
Harshavardhana	0b546ddfd4	Return errors in PutObject()/PutObjectPart() if input size is -1. (#5015 ) Amazon S3 API expects every incoming stream to have a content-length set, so it was superfluous for us to support an object layer which handles unknown sized streams as well. This PR removes such requirements and explicitly errors out if the input stream size is less than zero.	7 years ago
Harshavardhana	2e6ee68409	fix: [minor] Avoid unnecessary typecasting. (#4828 ) We don't need to typecast identifiers from their base type to the same type again. This is not a bug and the compiler is fine to skip it, but it is better to avoid it if not needed.	7 years ago
Frank Wessels	a2f2044528	Minor corrections in comments for xl utils (#4815 )	7 years ago
Andreas Auernhammer	85fcee1919	erasure: simplify XL backend operations (#4649 ) (#4758 ) This change provides new implementations of the XL backend operations: - create file - read file - heal file Further this change adds table based tests for all three operations. This affects also the bitrot algorithm integration. Algorithms are now integrated in an idiomatic way (like crypto.Hash). Fixes #4696 Fixes #4649 Fixes #4359	7 years ago
Frank Wessels	46897b1100	Name return values to prevent the need (and unnecessary code bloat) (#4576 ) This is done to avoid explicitly instantiating objects for every return statement.	8 years ago
Anis Elleuch	af8071c86a	xl: Fix rare freeze after many disk/network errors (#4438 ) xl.storageDisks is sometimes passed to some low-level XL functions. Some disks in xl.storageDisks are set to nil when they encounter errors. This means all elements in xl.storageDisks will be nil after some time, which leads to an unusable XL.	8 years ago
Aditya Manthramurthy	8975da4e84	Add new ReadFileWithVerify storage-layer API (#4349 ) This is an enhancement to the XL/distributed-XL mode. FS mode is unaffected. The ReadFileWithVerify storage-layer call is similar to ReadFile with the additional functionality of performing bit-rot checking. It accepts additional parameters for a hashing algorithm to use and the expected hex-encoded hash string. This patch provides significant performance improvement because: 1. combines the step of reading the file (during erasure-decoding/reconstruction) with bit-rot verification; 2. limits the number of file-reads; and 3. avoids transferring the file over the network for bit-rot verification. ReadFile API is implemented as ReadFileWithVerify with empty hashing arguments. Credits to AB and Harsha for the algorithmic improvement. Fixes #4236.	8 years ago
Harshavardhana	155a90403a	fs/erasure: Rename meta 'md5Sum' as 'etag'. (#4319 ) This PR also does backend format change to 1.0.1 from 1.0.0. Backward compatible changes are still kept to read the 'md5Sum' key. But all new objects will be stored with the same details under 'etag'. Fixes #4312	8 years ago
Krishnan Parthasarathi	417ec0df56	HealObject should succeed when only N/2 disks have data (#3952 )	8 years ago
Harshavardhana	bcc5b6e1ef	xl: Rename getOrderedDisks as shuffleDisks appropriately. (#3796 ) This PR is for readability cleanup - getOrderedDisks as shuffleDisks - getOrderedPartsMetadata as shufflePartsMetadata Distribution is now a second argument instead of being the primary input argument, for brevity. Also change the usage of type casted int64(0); instead rely on direct type reference as `var variable int64` everywhere.	8 years ago
Harshavardhana	6a6c930f5b	xl: Abort multipart upload should honor quorum properly. (#3670 ) Current implementation didn't honor quorum properly and didn't handle the errors generated properly. This patch addresses that and also moves common code `cleanupMultipartUploads` into xl specific private function. Fixes #3665	8 years ago
Harshavardhana	1b30a3be2b	xl/utils: getPartSizeFromIdx should return error. (#3669 )	8 years ago
Anis Elleuch	e9394dc22d	xl PutObject: Split object into parts (#3651 ) For faster time-to-first-byte when we try to download a big object	8 years ago
Krishnan Parthasarathi	c194b9f5f1	Implement mgmt REST APIs for heal subcommands (#3533 ) The heal APIs supported in this change are, - listing of objects to be healed. - healing a bucket. - healing an object.	8 years ago
Bala FA	0f2e493c9a	Use isErrIgnored() function wherever applicable. (#3343 )	8 years ago
Harshavardhana	5197649081	utils: reduceErrs returns and validates quorum errors. (#3300 ) This is needed as explained by @krisis. Let's say we have the following errors. ``` []error{nil, errFileNotFound, errDiskAccessDenied, errDiskAccessDenied} ``` Since the last two errors are filtered, the maximum is nil, depending on map order. Let's say we get nil from reduceErr. Clearly at this point we don't have quorum nodes agreeing about the data, and since GetObject only requires N/2 (read quorum), isDiskQuorum would have returned true. This is problematic and can lead to undesirable consequences. Fixes #3298	8 years ago
Harshavardhana	0b9f0d14a1	auth/rpc: Take remote disk offline after maximum allowed attempts. (#3288 ) When a disk is offline for a long period of time, we should ignore it after trying Login up to 5 times. This reduces network chattiness and also reduces the overall time spent on `net.Dial`. Fixes #3286	8 years ago
Karthic Rao	8bd78fbdfb	performance: gjson parsing for readXLMeta, listParts, getObjectInfo. (#2631 ) - Using gjson for constructing xlMetaV1{} in readXLMeta. - Test for parsing and constructing xlMetaV1{} using gjson. - Changes made since benchmarks showed 30-40% improvement in speed. - Follow up comments in issue https://github.com/minio/minio/issues/2208 for more details. - gjson parsing of parts from xl.json for listParts. - gjson parsing of statInfo from xl.json for getObjectInfo. - Vendorizing gjson dependency.	8 years ago
Krishna Srinivas	9358ee011b	logging: Print stack trace in case of errors. fixes #1827	8 years ago
Harshavardhana	bccf549463	server: Move all the top level files into cmd folder. (#2490 ) This brings in a change already made for the 'mc' package to allow for a clean repo and a cleaner GitHub drop-in experience.	8 years ago
karthic rao	5fe72cf205	Removing readAllMeta from xl-v1-healing.go and placing it in xl-v1-utils.go (#2296 )	8 years ago
Harshavardhana	5d118141cd	XL: Remove deadcode unionChecksumInfo. (#2261 )	8 years ago
Harshavardhana	cef26fd6ea	XL: Refactor usage of reduceErrs and consistent behavior. (#2240 ) This refactor is also needed in light of our quorum requirement change for the newly understood logic behind the klauspost/reedsolomon implementation.	8 years ago
Krishna Srinivas	8cc163e51a	Refactor xl.GetObject and erasureReadFile. (#2211 ) * XL: Refactor xl.GetObject and erasureReadFile. erasureReadFile() responsible for just erasure coding, it takes ordered disks and checkSum slice. * move getOrderedPartsMetadata and getOrderedDisks to xl-v1-utils.go * Review fixes.	8 years ago
Krishnan Parthasarathi	45240f158d	xl: Make namespace locking granular for PutObject (#2199 )	8 years ago
Harshavardhana	dc3bafb194	XL: isQuorum rename as isDiskQuorum, word it properly. (#2196 )	8 years ago
Krishnan Parthasarathi	0610527868	XL: PutObjectPart update checksum, re-read from xl.json for the part being written. (#2191 )	8 years ago
Harshavardhana	623e0f9243	XL: listOnlineDisks should use modTime instead of version. (#2166 ) This change is needed to make reading from objects future proof in terms of handling online disks. Our current counter is not based on affirmative knowledge and relies on an arithmetic sequence, which can lead to bugs. Using modTime simplifies the understanding of `xl.json` and future tooling / debugging of the format.	8 years ago
Krishnan Parthasarathi	bc8720406d	Added specific error for InvalidObjectName (#2157 )	9 years ago
Krishna Srinivas	ae80f8ca35	ObjectLayer/GetObject: Should return the right error value. Fix done in FS and XL. (#2133 ) fixes #2117	9 years ago
frankw	63b3f1dcfd	Use new algorithm to get fixed random order of disks (#2147 )	9 years ago
Harshavardhana	42286cba70	XL: Implement new ReadAll API for files which are read in single call. (#1974 ) Add a unit test as well.	9 years ago
Harshavardhana	e8990e42c2	XL: Make allocations simpler avoid redundant allocs. (#1961 ) - Reduce 10MiB buffers for loopy calls to use 128KiB. - start using 128KiB buffer where needed.	9 years ago
Harshavardhana	8c0942bf0d	XL: Remove usage of reduceErr and make it isQuorum verification. (#1909 ) Fixes #1908	9 years ago
Harshavardhana	fb95c1fad3	XL: Bring in some modularity into format verification and healing. (#1832 )	9 years ago
Harshavardhana	445dc22118	XL: Cleanup and add more comments. (#1807 )	9 years ago
Harshavardhana	b2293c2bf4	XL: Rename, cleanup and add more comments. (#1769 ) - xl-v1-bucket.go - removes a whole bunch of code. - {xl-v1,fs-v1}-metadata.go - add a lot of comments and rename functions appropriately.	9 years ago
Harshavardhana	553fdb9211	XL: Bring in support for object versions written during writeQuorum. (#1762 ) Erasure is initialized as needed depending on the quorum and onlineDisks. This way we can manage the quorum at the object layer.	9 years ago
Harshavardhana	4e34e03dd4	xl/fs: Split object layer into interface. (#1415 )	9 years ago
Harshavardhana	a1a667ae5d	xl: Change fileMetadata to xlMetadata. (#1404 ) Finalized backend format ``` { "version": "1.0.0", "stat": { "size": 24256, "modTime": "2016-04-28T00:11:37.843Z" }, "erasure": { "data": 5, "parity": 5, "blockSize": 4194304 }, "minio": { "release": "RELEASE.2016-04-28T00-09-47Z" } } ```	9 years ago
Krishna Srinivas	8c85815106	xl: refactor functions to xl-v1-common.go xl-v1-utils.go. (#1357 )	9 years ago
Krishna Srinivas	becc814531	Xl layer selfheal quorum2 * xl/selfheal: selfheal based on read quorum on GET * xl: getReadableDisks() also returns whether self-heal is needed so that this info can be used by ReadFile/SelfHeal/StatFile. * xl: trigger selfheal from StatFile.	9 years ago
Harshavardhana	9bd9441107	xl: Simplify reading metadata and add a new fileMetadata type. (#1346 )	9 years ago
Krishna Srinivas	5c33b68318	xl: code refactor, cleanup ReadFile and CreateFile.	9 years ago

25 Commits (fb96779a8a141e1012590841ef2cc7f0e4207eb5)
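
The odd-start bias described in commit d45a8784fc can be reproduced with a small Go sketch. The two one-line expressions are taken from that commit message; the function names startWithOr and startPlain, the helper distinctStarts, and the choice of feeding in CRC values 0..1023 with a cardinality of 16 are assumptions made for illustration and are not part of the minio code base.

```go
package main

import "fmt"

// startWithOr wraps the expression quoted in the commit message
// before the fix: the trailing "| 1" sets the lowest bit.
func startWithOr(keyCrc uint32, cardinality int) int {
	return int(keyCrc%uint32(cardinality)) | 1
}

// startPlain wraps the expression quoted after the fix.
func startPlain(keyCrc uint32, cardinality int) int {
	return int(keyCrc % uint32(cardinality))
}

// distinctStarts is a hypothetical helper: it runs the CRC values
// 0..1023 through f and collects every start position produced.
func distinctStarts(f func(uint32, int) int, cardinality int) map[int]bool {
	seen := make(map[int]bool)
	for crc := uint32(0); crc < 1024; crc++ {
		seen[f(crc, cardinality)] = true
	}
	return seen
}

func main() {
	const cardinality = 16 // assumed set size for the demonstration
	fmt.Println("distinct starts with | 1:", len(distinctStarts(startWithOr, cardinality)))
	fmt.Println("distinct starts without :", len(distinctStarts(startPlain, cardinality)))
}
```

With a cardinality of 16, the first variant only ever produces the 8 odd start positions, while the second reaches all 16, which matches the uneven distribution the commit message reports.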