doc: Merge large bucket with distributed docs (#7761)

master
Harshavardhana 6 years ago committed by kannappanr
parent d90d4841b8
commit a075015293
1. docs/distributed/DESIGN.md (0 changes)
2. docs/distributed/README.md (32 changes)
3. docs/large-bucket/README.md (47 changes)
4. docs/screenshots/Architecture-diagram_distributed_32.png (BIN)

@@ -14,23 +14,23 @@ Distributed MinIO provides protection against multiple node/drive failures and [
A stand-alone MinIO server would go down if the server hosting the disks goes offline. In contrast, a distributed MinIO setup with _n_ disks will have your data safe as long as _n/2_ or more disks are online. You'll need a minimum of _(n/2 + 1)_ [Quorum](https://github.com/minio/dsync#lock-process) disks to create new objects though.
For example, a 16-node distributed MinIO setup with 16 disks per node would continue serving files, even if up to 8 servers are offline. But you'll need at least 9 servers online to create new objects.
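To make the arithmetic concrete, here is a rough sketch in shell, assuming all 256 drives count directly toward the _n/2_ rule (in practice MinIO groups drives into erasure sets, so treat this as an approximation):

```sh
# Quorum arithmetic for a 16-node, 16-drives-per-node setup (illustrative only)
nodes=16; drives_per_node=16
n=$((nodes * drives_per_node))      # 256 drives in total
read_quorum=$((n / 2))              # 128 drives, i.e. 8 servers, must stay online for reads
write_quorum=$((n / 2 + 1))         # 129 drives, i.e. at least 9 servers, for new objects
echo "reads need $read_quorum drives online, writes need $write_quorum"
```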
### Limits
As with MinIO in stand-alone mode, distributed MinIO has a per-tenant limit of a minimum of 2 and a maximum of 32 servers. There are no limits on the number of disks across these servers. If you need a multi-tenant setup, you can easily spin up multiple MinIO instances managed by orchestration tools like Kubernetes or Docker Swarm.
Note that with distributed MinIO you can play around with the number of nodes and drives as long as the limits are adhered to. For example, you can have 2 nodes with 4 drives each, 4 nodes with 4 drives each, 8 nodes with 2 drives each, 32 servers with 64 drives each and so on.
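For illustration, each of these layouts corresponds to a single `minio server` command; the host and drive names below are placeholders:

```sh
# Placeholder hostnames/paths; one command per example layout
minio server http://node{1...2}/drive{1...4}    # 2 nodes, 4 drives each
minio server http://node{1...4}/drive{1...4}    # 4 nodes, 4 drives each
minio server http://node{1...8}/drive{1...2}    # 8 nodes, 2 drives each
minio server http://node{1...32}/drive{1...64}  # 32 nodes, 64 drives each
```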
You can also use [storage classes](https://github.com/minio/minio/tree/master/docs/erasure/storage-class) to set custom data and parity distribution per object.
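For example, per the storage-class documentation linked above, parity can be configured through environment variables before starting the server; the `EC:` values here are illustrative:

```sh
# Use 4 parity drives per object for STANDARD class objects,
# and 2 parity drives for REDUCED_REDUNDANCY class objects
export MINIO_STORAGE_CLASS_STANDARD=EC:4
export MINIO_STORAGE_CLASS_RRS=EC:2
```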
### Consistency Guarantees
MinIO follows a strict **read-after-write** and **list-after-write** consistency model for all I/O operations, both in distributed and standalone modes.
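In practice this means a successful write is immediately visible to subsequent reads and list operations. A quick sketch with [`mc`](https://docs.min.io/docs/minio-client-quickstart-guide), assuming an alias named `myminio` is already configured:

```sh
mc cp ./report.pdf myminio/docs/              # write an object
mc cat myminio/docs/report.pdf > /dev/null    # read-after-write: immediately readable
mc ls myminio/docs/                           # list-after-write: immediately listed
```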
# Get started
If you're familiar with stand-alone MinIO setup, the process remains largely the same. The MinIO server automatically switches to stand-alone or distributed mode, depending on the command-line parameters.
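For illustration, the same binary switches mode based purely on its arguments; the paths and hosts below are placeholders:

```sh
# Stand-alone mode: a single local path
minio server /data

# Distributed mode: multiple HTTP endpoints
minio server http://host{1...4}/data
```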
## 1. Prerequisites
@@ -43,25 +43,25 @@ To start a distributed MinIO instance, you just need to pass drive locations as
*Note*
- All the nodes running distributed MinIO need to have the same access key and secret key for the nodes to connect. To achieve this, it is **mandatory** to export the access key and secret key as environment variables, `MINIO_ACCESS_KEY` and `MINIO_SECRET_KEY`, on all the nodes before executing the MinIO server command.
- All the nodes running a distributed MinIO setup are recommended to be in a homogeneous environment, i.e. the same operating system, the same number of disks and the same network interconnects.
- MinIO distributed mode requires fresh directories. If required, the drives can be shared with other applications. You can do this by using a sub-directory exclusive to MinIO. For example, if you have mounted your volume under `/export`, pass `/export/data` as the argument to the MinIO server.
- The IP addresses and drive paths below are for demonstration purposes only; you need to replace these with the actual IP addresses and drive paths/folders.
- Servers running distributed MinIO instances should be less than 15 minutes apart. You can enable the [NTP](http://www.ntp.org/) service as a best practice to ensure consistent times across servers.
- Running distributed MinIO on the Windows operating system is experimental. Please proceed with caution.
- The `MINIO_DOMAIN` environment variable should be defined and exported if a domain needs to be set (see the example after this list).
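For example, a hypothetical deployment served under the domain `minio.example.com` would export:

```sh
# Placeholder domain; replace with the domain your deployment is served under
export MINIO_DOMAIN=minio.example.com
```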
Example 1: Start a distributed MinIO instance on 32 nodes with 32 drives each, mounted at `/export1` to `/export32` (pictured below), by running this command on all 32 nodes:
![Distributed MinIO, 32 nodes with 32 drives each](https://github.com/minio/minio/blob/master/docs/screenshots/Architecture-diagram_distributed_32.png?raw=true)
#### GNU/Linux and macOS
```sh
export MINIO_ACCESS_KEY=<ACCESS_KEY>
export MINIO_SECRET_KEY=<SECRET_KEY>
minio server http://host{1...32}/export{1...32}
```
__NOTE:__ `{1...n}` shown above has 3 dots! Using only 2 dots, `{1..32}`, will be interpreted by your shell and won't be passed to the minio server, affecting the erasure coding order, which may impact performance and high availability. __Always use the ellipsis syntax `{1...n}` (3 dots!) for optimal erasure-code distribution.__
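You can check the difference directly in a bash shell; `echo` shows exactly what the server would receive:

```sh
# 2 dots: bash brace expansion rewrites the argument before minio sees it
echo http://host{1..4}/export    # prints: http://host1/export ... http://host4/export
# 3 dots: not valid brace expansion, so bash passes the pattern through unchanged
echo http://host{1...4}/export   # prints: http://host{1...4}/export
```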
## 3. Test your setup
To test this setup, access the MinIO server via browser or [`mc`](https://docs.min.io/docs/minio-client-quickstart-guide).
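A minimal smoke test with `mc`, assuming the alias name `myminio` and `host1` as one of the nodes from the example above:

```sh
# Register one node of the deployment under the alias "myminio"
mc config host add myminio http://host1:9000 <ACCESS_KEY> <SECRET_KEY>
mc mb myminio/testbucket               # create a bucket
mc cp /etc/hosts myminio/testbucket/   # upload a sample file
mc ls myminio/testbucket               # verify the upload is listed
```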

@@ -1,47 +0,0 @@
# Large Bucket Support Quickstart Guide [![Slack](https://slack.min.io/slack?type=svg)](https://slack.min.io) [![Go Report Card](https://goreportcard.com/badge/minio/minio)](https://goreportcard.com/report/minio/minio) [![Docker Pulls](https://img.shields.io/docker/pulls/minio/minio.svg?maxAge=604800)](https://hub.docker.com/r/minio/minio/)
Erasure code in MinIO is limited to 16 disks. On its own, that provides enough storage space to hold a tenant's data. However, to cater to cases that need a larger number of disks or higher storage space requirements, large bucket support was introduced.
We call it large-bucket because it allows a single MinIO bucket to expand over multiple erasure-code deployment sets. Without any special configuration, it helps you create peta-scale storage systems. With large bucket support, you can use more than 16 disks upfront while deploying the MinIO server. Internally MinIO creates multiple smaller erasure-coded sets, and these sets are further combined into a single namespace. This document gives a brief introduction on how to get started with large bucket deployments. To explore advanced uses and limitations further, refer to the [design document](https://github.com/minio/minio/blob/master/docs/large-bucket/DESIGN.md).
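As a sketch of the set arithmetic (the exact set sizing is chosen by the server, so the 16-drive split below is an assumption for illustration):

```sh
# Assumed set formation for illustration: 64 drives split into 16-drive sets
total_drives=64
set_size=16                            # assumption; the server picks the actual size
num_sets=$((total_drives / set_size))  # 4 erasure-coded sets, one combined namespace
echo "$num_sets sets of $set_size drives each"
```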
## Get started
If you're familiar with distributed MinIO setup, the installation and configuration remain the same. The only new addition to the input syntax is the `...` convention to abbreviate the drive arguments. Remote drives in a distributed setup are encoded as HTTP(S) URIs, which can be abbreviated in the same way.
### 1. Prerequisites
Install MinIO - [MinIO Quickstart Guide](https://docs.min.io/docs/minio-quickstart-guide).
### 2. Run MinIO on many drives
We'll see examples of how to do this in the following sections.
*Note*
- All the nodes running distributed MinIO need to have the same access key and secret key. To achieve this, export the access key and secret key as environment variables on all the nodes before executing the MinIO server command.
- The drive paths below are for demonstration purposes only; you need to replace these with the actual drive paths/folders.
#### MinIO large bucket on multiple drives (standalone)
You'll need the paths to the disks, e.g. `/export1, /export2 .... /export24`. Then run the following commands on all the nodes where you'd like to launch MinIO.
```sh
export MINIO_ACCESS_KEY=<ACCESS_KEY>
export MINIO_SECRET_KEY=<SECRET_KEY>
minio server /export{1...24}
```
#### MinIO large bucket on multiple servers (distributed)
You'll need the paths to the disks, e.g. `http://host1/export1, http://host2/export2 .... http://host4/export16`. Then run the following commands on all the nodes where you'd like to launch MinIO.
```sh
export MINIO_ACCESS_KEY=<ACCESS_KEY>
export MINIO_SECRET_KEY=<SECRET_KEY>
minio server http://host{1...4}/export{1...16}
```
### 3. Test your setup
To test this setup, access the MinIO server via browser or [`mc`](https://docs.min.io/docs/minio-client-quickstart-guide). You'll see that the uploaded files are accessible from all the MinIO endpoints.
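One way to verify this, assuming `mc` aliases pointing at two different nodes of the same deployment:

```sh
# Hypothetical aliases for two nodes of the same deployment
mc config host add n1 http://host1:9000 <ACCESS_KEY> <SECRET_KEY>
mc config host add n2 http://host2:9000 <ACCESS_KEY> <SECRET_KEY>
mc mb n1/largebucket                 # create a bucket via the first endpoint
mc cp ./file.txt n1/largebucket/     # upload through the first endpoint
mc ls n2/largebucket                 # the object is visible from the second
```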
## Explore Further
- [Use `mc` with MinIO Server](https://docs.min.io/docs/minio-client-quickstart-guide)
- [Use `aws-cli` with MinIO Server](https://docs.min.io/docs/aws-cli-with-minio)
- [Use `s3cmd` with MinIO Server](https://docs.min.io/docs/s3cmd-with-minio)
- [Use `minio-go` SDK with MinIO Server](https://docs.min.io/docs/golang-client-quickstart-guide)
- [The MinIO documentation website](https://docs.min.io)

Binary file not shown: docs/screenshots/Architecture-diagram_distributed_32.png (added, 44 KiB).
