Project author: cmcconnell1

Project description:
6.x Elasticsearch Chef Cookbook with TLS/SSL, Kibana, and Cerebro for management, etc.
Primary language: Ruby
Project URL: git://github.com/cmcconnell1/cmcc_elasticsearch.git
Created: 2018-05-10T22:08:38Z
Project community: https://github.com/cmcconnell1/cmcc_elasticsearch

License: Other



td_elasticsearch Elasticsearch

Requirements

  • Pre-existing secure and encrypted (via restricted bucket policy) S3 bucket with your pre-existing .key, .crt, and .jks (cerebro truststore) files.
    • i.e.: copy them to an S3 bucket path like: terradatum-chef/certs/elasticsearch/dev1
  • This Chef cookbook uses the cluster name (‘dev1’ or ‘prod1’) to determine the correct PEM and keystore files to fetch from the S3 bucket for the cluster.
    • i.e.:
      • terradatum-chef/certs/elasticsearch/prod1
      • terradatum-chef/certs/elasticsearch/dev1
  • The nodes should be launched in EC2 with the requisite IAM role/policies (and S3 bucket policy) to facilitate access.
    • See the example secure, encrypted S3 bucket policy below, and the hedged IAM policy sketch immediately after this list.
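For reference, a minimal sketch of the kind of IAM policy the instance role (e.g. chefAwsRole) might carry so the nodes can read the certs; the statements below are illustrative only and are not taken from this cookbook:

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "AllowCertBucketList",
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": "arn:aws:s3:::terradatum-chef"
      },
      {
        "Sid": "AllowCertObjectRead",
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::terradatum-chef/certs/elasticsearch/*"
      }
    ]
  }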

Platforms

  • RHEL and derivatives

Chef

  • Chef >= 12.1

Usage—inclusion via roles

  • Elasticsearch
    • Prod:
      • include elasticsearch-prod role
    • Dev:
      • include elasticsearch-dev role
  • Kibana and Cerebro
    • Prod:
      • elasticsearch-prod, kibana-prod, cerebro-prod
    • Dev:
      • elasticsearch-dev, kibana-dev, cerebro-dev
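As a rough illustration, an elasticsearch-prod role might look something like the following (Ruby role DSL); the exact run_list and any attribute overrides in the real roles are assumptions here:

  # roles/elasticsearch-prod.rb -- illustrative sketch only
  name 'elasticsearch-prod'
  description 'Elasticsearch prod1 cluster node'
  run_list(
    'recipe[td_elasticsearch::default]',
    'recipe[td_elasticsearch::snapshots]'
  )
  # hypothetical override; the real attribute keys live in attributes/default.rb
  override_attributes(
    'td_elasticsearch' => { 'cluster_name' => 'prod1' }
  )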

Recipes

  • td_elasticsearch::default
  • td_elasticsearch::kibana
  • td_elasticsearch::cerebro
  • td_elasticsearch::snapshots

Cookbook and Recipe Overview; TL;DR

Quick high-level summary:

  • Chef Provisioning using TLS/SSL certs, keys, etc., for:

    • Elasticsearch with snapshots

      • Recipe: td_elasticsearch::default
      • Installed and configured as a systemd service.
      • Supports 6.x+ with x-pack TLS requirements.
        • Tested / validated on 6.2.2-1
      • Distributes and configures the (pre-existing) TLS/SSL certs, keys, etc. from the requisite encrypted, secure S3 bucket/path.
    • Elasticsearch snapshots via cron

      • Recipe: td_elasticsearch::snapshots
        • Manage via snapshots section in the attributes/default.rb (and/or via chef environment, roles, etc.) file.
    • Kibana (HTTPS/SSL) “Coordinating Node”

      • Recipe: td_elasticsearch::kibana
        • Installed and configured as a systemd service.
        • Post chef-client, available at an HTTPS URL, i.e.: https://kibana1.terradatum.com:5601
          • Redirects to the kibana login/service; the kibana service terminates the TLS/SSL at the kibana app (see the hedged kibana.yml sketch after this list).
    • Cerebro (HTTPS/SSL) on the Kibana node

      • Recipe: td_elasticsearch::cerebro
        • Installed and configured as a systemd service.
        • Terminates TLS/SSL at the web service on port 443 and redirects to the cerebro play/java app running locally on port 9000.
          • Post chef-client, available at an HTTPS URL, i.e.: https://kibana1.terradatum.com
            • Cerebro redirects to its login page; use the following:
              • Node Address: ‘kibana.terradatum.com:9200’ (9200 is the elasticsearch port)
              • Username: use your elasticsearch credentials/account, i.e. ‘kibana’, ‘cmcc’, etc.
              • Password: the requisite password for the above elasticsearch login
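For context on the Kibana TLS termination mentioned above, here is a rough kibana.yml sketch. The cookbook ships its own template; the file paths below are assumptions:

  # kibana.yml (sketch): TLS terminated at the Kibana app itself
  server.host: "kibana1.terradatum.com"
  server.ssl.enabled: true
  server.ssl.certificate: /etc/kibana/certs/kibana1.crt        # assumed path
  server.ssl.key: /etc/kibana/certs/kibana1.key                # assumed path
  elasticsearch.url: "https://elastic1.terradatum.com:9200"
  elasticsearch.username: "kibana"
  elasticsearch.password: "CHANGE_ME"
  elasticsearch.ssl.certificateAuthorities: [ "/etc/kibana/certs/ca.crt" ]   # assumed path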

Attributes

  • The requisite attributes for all of the above recipes can be set in the cookbook attributes or in the requisite Chef environment, roles, etc.; see the hedged environment sketch below.
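For example, a Chef environment override might look roughly like this (Ruby environment DSL); the attribute keys shown are hypothetical and must match whatever attributes/default.rb actually defines:

  # environments/prod.rb -- illustrative sketch only
  name 'prod'
  description 'Production environment overrides for td_elasticsearch'
  override_attributes(
    'td_elasticsearch' => {
      'cluster_name' => 'prod1',                     # hypothetical key
      's3_bucket'    => 'terradatum-chef',           # hypothetical key
      's3_cert_path' => 'certs/elasticsearch/prod1'  # hypothetical key
    }
  )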

Notes, Caveats, Issues

Elasticsearch

  • We were forced to use a beta branch of the upstream elasticsearch cookbook whose version string was not valid.
    With the “4.0.0-beta” branch, both berks and chef-client died trying to handle it:

      {"error":["Invalid cookbook version '4.0.0-beta'."]}

    To move forward, we took the beta branch, set it to a valid version number, and uploaded that version to our Chef server.
    For now we used 4.0.0.

  • Please review https://github.com/elastic/cookbook-elasticsearch/blob/4.0.0-beta/README.md

  • Elasticsearch did not seem capable of using our wildcard certs (signed by Digi); this was a time-consuming, frustrating exercise.
    • Highly recommend staying on the supported path of configuring/using self-signed CA certs.
  • The ES docs for TLS/SSL post 6.x were not complete and were lacking many details when this work was done.
    • Many issues were encountered which required ES support to provide steps, as the new 6.x+ docs were lacking.
      • Thanks to their support for assisting.
  • Highly recommend using PEM format for certs!
    • We could not use PKCS#12, as this cert format does NOT work with remote curl commands.
    • On that note, currently Kibana must use PEM, AFAIK.
  • This may have changed, but at that time we could not use a passphrase on the PEM key unless we included it in the elasticsearch.yml file (the docs did not state this). YMMV. A hedged elasticsearch.yml sketch follows this list.

  • kitchen

    • Currently requires using kitchen-vagrant due to required hostname changes, etc.
    • Using kitchen-docker has been unstable and unreliable for td_elasticsearch on my mac, so I was forced to stop using it; vagrant just works for me. In addition, we need to take actions based on hostname, which is difficult to do/support with Docker.
    • AWS TLS/SSL configuration, testing and validation really can’t be done locally so we do some but not all.
    • AWS EC2 ES cluster node discovery cannot be tested locally.
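Tying the PEM notes above together, here is a rough sketch of the kind of x-pack TLS block that ends up in elasticsearch.yml on a 6.x node. The file is managed by the cookbook templates; the per-node file names below come from the S3 listing in this doc, and the passphrase lines only apply if your key has one:

  # elasticsearch.yml (sketch) -- per-node cert/key names come from the S3 bucket
  xpack.security.enabled: true
  xpack.security.transport.ssl.enabled: true
  xpack.security.transport.ssl.verification_mode: certificate
  xpack.security.transport.ssl.key: /etc/elasticsearch/certs/elastic1.key
  xpack.security.transport.ssl.certificate: /etc/elasticsearch/certs/elastic1.crt
  xpack.security.transport.ssl.certificate_authorities: [ "/etc/elasticsearch/certs/ca.crt" ]
  xpack.security.http.ssl.enabled: true
  xpack.security.http.ssl.key: /etc/elasticsearch/certs/elastic1.key
  xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/elastic1.crt
  xpack.security.http.ssl.certificate_authorities: [ "/etc/elasticsearch/certs/ca.crt" ]
  # only needed if the PEM key is passphrase-protected (see the note above)
  xpack.security.transport.ssl.key_passphrase: CHANGE_ME
  xpack.security.http.ssl.key_passphrase: CHANGE_ME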

Pre node/cluster deployment requirements

  • Note: (see the Elasticsearch, Kibana, and Cerebro docs for more details on how to create/configure TLS/SSL certs, etc.)

    • There is quite a lot to understand, configure, and do to support TLS for ES, and it is outside the scope of this doc.
  • Update/modify the attributes/default.rb, scripts/templates, license files, S3 bucket/path, PEM certs/keys, keystores, user creds, etc., FOR YOUR ENVIRONMENT.

  • Encrypted S3 buckets with restricted access to only specified AWS Access keys (used by IAM roles).

    • Prod cluster: terradatum-chef/certs/elasticsearch/prod1
    • Dev cluster: terradatum-chef/certs/elasticsearch/dev1
    • Secure encrypted S3 Bucket Policy example (use at your discretion).
      • The policy below enforces encryption in transit, and only allows specified AWS access keys access.
        {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Deny",
              "Principal": "*",
              "Action": "s3:*",
              "Resource": [
                "arn:aws:s3:::terradatum-chef",
                "arn:aws:s3:::terradatum-chef/*"
              ],
              "Condition": {
                "StringNotLike": {
                  "aws:userId": [
                    "MYAWSACCESSKEYGOESHER",
                    "XXOXOXOXOXXXOXXXOXXXX"
                  ]
                }
              }
            },
            {
              "Effect": "Deny",
              "Principal": "*",
              "Action": "s3:*",
              "Resource": [
                "arn:aws:s3:::terradatum-chef",
                "arn:aws:s3:::terradatum-chef/*"
              ],
              "Condition": {
                "Bool": {
                  "aws:SecureTransport": "false"
                }
              }
            }
          ]
        }
  • Requisite .crt, .key, .jks (cerebro) files for every node in respective ES cluster in requisite S3 bucket.

    • You must create these and copy to requisite S3 bucket and path before using these recipes.
    • i.e. for prod1 cluster:

      aws s3 ls s3://terradatum-chef/certs/elasticsearch/prod1/
      ca.key
      elastic1.crt
      elastic1.key
      elastic2.crt
      elastic2.key
      ...
      elastic7.crt
      elastic7.key
      kibana1.crt
      kibana1.key
      kibana2.crt
      kibana2.key
      kibana1-cerebro-truststore.jks
      prod-instances.yml

      cat prod-instances.yml (follow the ES docs for the process to create these correctly; a hedged certgen sketch follows this list)

      instances:
        - name: 'elastic1'
          dns: [ 'elastic1.terradatum.com' ]
        - name: 'elastic2'
          dns: [ 'elastic2.terradatum.com' ]
        ...
        - name: 'elastic7'
          dns: [ 'elastic7.terradatum.com' ]
        - name: 'kibana1'
          dns: [ 'kibana1.terradatum.com' ]
        - name: 'kibana2'
          dns: [ 'kibana2.terradatum.com' ]
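A hedged sketch of generating the certs with the 6.2-era x-pack certgen tool; check the ES docs for your exact version, since the tool, its flags, and the layout of the resulting zip may differ:

  # generate a CA plus per-node certs/keys from the instances file (x-pack certgen, ES 6.2.x)
  /usr/share/elasticsearch/bin/x-pack/certgen --in prod-instances.yml --out /tmp/prod1-certs.zip
  unzip /tmp/prod1-certs.zip -d /tmp/prod1-certs
  # the zip typically contains one subdirectory per instance; flatten/rename as needed, then
  # copy the .crt/.key files (plus the cerebro truststore) to the requisite bucket path
  aws s3 cp /tmp/prod1-certs/ s3://terradatum-chef/certs/elasticsearch/prod1/ --recursive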

Launching EC2 ES cluster nodes with knife

DEV (dev1) ES cluster

We use the latest elasticsearch AMI (currently ami-e08e9a80).

The initial DEV cluster node JVMs will be configured with ~15.5G RAM available (Xmx, Xms), using the r4.xlarge instance type.
We may scale as needed.

deploy dev1 DEV cluster using existing custom AMI with 300 GB EBS volume for elasticsearch data
  for node in elastic{1..3}.dev; do knife ec2 server create --image ami-e08e9a80 -f r4.xlarge --region us-west-1 --subnet subnet-c3b87cab \
    -g "sg-ccf91aa3" -g "sg-0e609d6a" -g "sg-56e3ba33" -g "sg-01c54a64" --ssh-key td-aws-dev --ssh-user deploy \
    --identity-file "${HOME}/.ssh/td-aws-dev.pem" --node-name "${node}.terradatum.com" -r "role[elasticsearch-dev]" \
    --environment dev --fqdn "${node}.terradatum.com" --tags "ENV=DEV,Name=${node}.terradatum.com,APP=elasticsearch,Cluster=dev1" \
    --iam-profile chefAwsRole; done

STAGE cluster not used.

PROD (prod1) cluster

  • PROD cluster node JVMs will have ~30.5G RAM, using the r4.2xlarge instance type.
    • Further testing will determine the correct value, since the literature puts this number between 24G and 30.5G.
Deploy using customized ES AMI providing a separate 300GB EBS data volume.
  for node in elastic{1..3}; do knife ec2 server create --image ami-e08e9a80 -f r4.2xlarge --region us-west-1 --subnet subnet-c4b87cac \
    -g "sg-a27ab3c7" -g "sg-a399f7da" -g "sg-56e3ba33" -g "sg-01c54a64" --ssh-key td-aws-ops --ssh-user deploy --identity-file "${HOME}/.ssh/td-aws-ops.pem" \
    --node-name "${node}.terradatum.com" -r "role[elasticsearch-prod]" --environment prod --fqdn "${node}.terradatum.com" \
    --tags "ENV=PROD,Name=${node}.terradatum.com,APP=elasticsearch,Cluster=prod1" --server-connect-attribute private_ip_address --iam-profile chefAwsRole; done
PROD (prod1 cluster) kibana node uses default public CentOS-7 latest image because it does NOT need the 300GB ES data volume.
  for node in kibana1; do knife ec2 server create --image ami-65e0e305 -f t2.large --region us-west-1 --subnet \
    subnet-c4b87cac -g "sg-a27ab3c7" -g "sg-a399f7da" -g "sg-56e3ba33" -g "sg-01c54a64" --ssh-key td-aws-ops \
    --ssh-user centos --identity-file "${HOME}/.ssh/td-aws-ops" --node-name "${node}.terradatum.com" \
    -r "role[kibana-prod]" --environment prod --fqdn "${node}.terradatum.com" --tags \
    "ENV=PROD,Name=${node}.terradatum.com,APP=elasticsearch,Cluster=prod1" --server-connect-attribute private_ip_address \
    --iam-profile chefAwsRole;
  done

Post chef knife bootstrap node creation and successful chef-client run

Run setup-passwords. NOTE: the passwords for the accounts will be the same on all nodes in the cluster.

Set passwords for the elastic, kibana, and logstash_system accounts.
OPTION 1: use auto password generation

Commands shown for LOCAL, DEV, and PROD:

  /usr/share/elasticsearch/bin/x-pack/setup-passwords auto -u "https://elastic1.local:9200"
  /usr/share/elasticsearch/bin/x-pack/setup-passwords auto -u "https://elastic2.dev:9200"
  /usr/share/elasticsearch/bin/x-pack/setup-passwords auto -u "https://elastic3.terradatum.com:9200"
Set passwords
OPTION 2: use interactive password creation (you set the passwords at the command line)

Commands shown for LOCAL, DEV, and PROD:

  /usr/share/elasticsearch/bin/x-pack/setup-passwords interactive -u "https://elastic1.local:9200"
  /usr/share/elasticsearch/bin/x-pack/setup-passwords interactive -u "https://elastic2.dev:9200"
  /usr/share/elasticsearch/bin/x-pack/setup-passwords interactive -u "https://elastic3.terradatum.com:9200"

Remember to update LastPass with the new passwords for dev and prod clusters

Validate that secure curl with the CA cert works

  curl --cacert /etc/elasticsearch/certs/ca.crt -u elastic 'https://elastic1.local:9200/_cat/nodes'
  curl --cacert /etc/elasticsearch/certs/ca.crt -u elastic 'https://elastic2.dev:9200/_cat/nodes'
  curl --cacert /etc/elasticsearch/certs/ca.crt -u elastic 'https://elastic3.terradatum.com:9200/_cat/nodes'
  # using the elastic account and password with curl
  curl --cacert /etc/elasticsearch/certs/ca.crt -u elastic:XXXXXXXX 'https://elastic3.terradatum.com:9200/_cat/nodes'
  10.1.0.60 1 57 0 0.00 0.13 0.14 mdi * elastic1
  10.1.0.94 1 56 3 0.07 0.20 0.23 mdi - elastic3
  10.1.0.196 2 57 0 0.13 0.16 0.20 mdi - elastic2

Export ES_USR, ES_PASSWD (with admin/super creds), and CURL_CA_BUNDLE for easier curl commands

Note: do not use the snapshot user here; use a real admin/super user. The snapshot user can create snapshots but is otherwise limited.
i.e.:

  export CURL_CA_BUNDLE="/etc/elasticsearch/certs/ca.crt"
  export ES_USR="my_admin_user"
  export ES_PASSWD="my_admin_passwd"
  curl -u $ES_USR:$ES_PASSWD -X GET "${ES_URL}/_cat/nodes"
  10.1.0.60 7 59 0 0.00 0.01 0.05 mdi * elastic1
  10.1.0.103 6 92 2 0.00 0.02 0.05 - - kibana1
  10.1.0.94 6 61 0 0.00 0.01 0.05 mdi - elastic3
  10.1.0.196 8 62 0 0.00 0.01 0.05 mdi - elastic2

Elasticsearch Snapshots

  • Configured on the target kibana coordinating (or other cluster) node by the td_elasticsearch::snapshots recipe.
  • Fully configurable via the snapshots section of the attributes/default.rb file.
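The validation below uses a snapshot repository named prod1. How that repository gets registered depends on your attributes; if you ever need to register one by hand, it looks roughly like the following, where the repository type and location are pure assumptions (an S3 repository via the repository-s3 plugin is another common choice):

  # register a snapshot repository named prod1 (type/location are illustrative only)
  # note: the "fs" type requires path.repo to be set on every node
  curl -u $ES_USR:$ES_PASSWD -X PUT "${ES_URL}/_snapshot/prod1" -H 'Content-Type: application/json' -d '
  {
    "type": "fs",
    "settings": { "location": "/var/backups/elasticsearch/prod1" }
  }'
  # take an ad-hoc snapshot and list what exists
  curl -u $ES_USR:$ES_PASSWD -X PUT "${ES_URL}/_snapshot/prod1/snapshot-manual-$(date +%Y%m%d-%H%M%S)"
  curl -u $ES_USR:$ES_PASSWD -X GET "${ES_URL}/_snapshot/prod1/_all?pretty"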

snapshot validation post chef-client run

  [root@kibana1 ~]# ll /etc/cron.d | grep 'elastic-snapshots'
  -rw-r--r--. 1 root root 157 May 9 20:27 prod-create-elastic-snapshots
  -rw-r--r--. 1 root root 139 May 10 17:08 prod-rotate-elastic-snapshots
  [root@kibana1 ~]# cat /etc/cron.d/prod-create-elastic-snapshots
  # Generated by Chef. Changes will be overwritten.
  1 9 * * * root /opt/td-elastic-utilties/create-elastic-snapshots.sh -n prod1 -r prod1 -s snapshot-nightly
  [root@kibana1 ~]# cat /etc/cron.d/prod-rotate-elastic-snapshots
  # Generated by Chef. Changes will be overwritten.
  10 17 * * * root /opt/td-elastic-utilties/rotate-elastic-snapshots.sh -n prod1 -r prod1

validate snapshot limits enforced by our scripts from cron

In our case on prod we’ll keep 15 snapshots

  [root@kibana1 ~]# grep LIMIT /opt/td-elastic-utilties/prod1-cluster-vars
  # LIMIT must be >= 1
  export LIMIT=15
  [root@kibana1 ~]# source /opt/td-elastic-utilties/prod1-cluster-vars
  [root@kibana1 ~]# env | grep LIMIT
  LIMIT=15
  [root@kibana1 ~]# curl -u $ES_USR:$ES_PASSWD -X GET "${ES_URL}/_snapshot/prod1/_all?pretty" --silent | jq -r '.snapshots [] .snapshot'
  snapshot-bruins-20180508-110550
  snapshot-wild-20180508-110600
  snapshot-coyotes-20180508-121008
  snapshot-ducks-20180508-121022
  snapshot-from-kibana-test1-20180508-193648
  snapshot-whalers-20180508-125408
  snapshot-redwings-20180508-125945
  snapshot-sharks-20180508-130241
  snapshot-stars-20180508-131301
  snapshot-nightly-20180509-161401
  snapshot-nightly-20180509-161901
  snapshot-nightly-20180509-171901
  snapshot-nightly-20180509-181901
  snapshot-nightly-20180509-190101
  snapshot-nightly-20180510-090101
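The rotate script itself is not reproduced here; conceptually it keeps the newest $LIMIT snapshots and deletes the rest via the snapshot API. A hedged sketch of that idea (our actual script may differ):

  # keep the newest $LIMIT snapshots in repo prod1 and delete the rest (sketch only)
  source /opt/td-elastic-utilties/prod1-cluster-vars
  curl -u $ES_USR:$ES_PASSWD --silent -X GET "${ES_URL}/_snapshot/prod1/_all" \
    | jq -r '.snapshots | sort_by(.start_time_in_millis) | .[] .snapshot' \
    | head -n -${LIMIT} \
    | while read snap; do
        curl -u $ES_USR:$ES_PASSWD -X DELETE "${ES_URL}/_snapshot/prod1/${snap}"
      done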

Licensing nodes/cluster

We have this automated but currently disabled, and are just doing this simple step manually; the requisite license file is delivered to /var/tmp.
Optionally, you could omit the password and enter it interactively when you execute the command.

  curl --cacert /etc/elasticsearch/certs/ca.crt -XPUT -u elastic:XXXXXXX 'https://elastic1.terradatum.com:9200/_xpack/license' -H 'Content-Type: application/json' -d @/var/tmp/terradatum-prod.json
  {"acknowledged":true,"license_status":"valid"}

Kibana coordinating nodes with requisite TLS/SSL certs, etc.

deploy kibana dev coordinating node

  export node=kibana1.dev
  knife ec2 server create --image ami-65e0e305 -f t2.medium --region us-west-1 --subnet subnet-c4b87cac -g sg-d334ccb6 \
    -g "sg-ccf91aa3" -g "sg-0e609d6a" -g "sg-56e3ba33" -g "sg-01c54a64" --ssh-key td-aws-dev --ssh-user deploy \
    --identity-file "${HOME}/.ssh/td-aws-dev.pem" --node-name "${node}.terradatum.com" -r "role[kibana-dev]" \
    --environment dev --fqdn "${node}.terradatum.com" --tags \
    "ENV=DEV,Name=${node}.terradatum.com,APP=elasticsearch,Cluster=dev1" --iam-profile chefAwsRole

deploy prod kibana coordinating node(s) using default community CentOS AMI

  for node in kibana{1..2}; do
    knife ec2 server create --image ami-65e0e305 -f t2.large --region us-west-1 --subnet subnet-c4b87cac \
      -g "sg-a27ab3c7" -g "sg-a399f7da" -g "sg-56e3ba33" -g "sg-01c54a64" --ssh-key td-aws-ops --ssh-user \
      centos --identity-file "${HOME}/.ssh/td-aws-ops" --node-name "${node}.terradatum.com" -r "role[kibana-prod]" \
      --environment prod --fqdn "${node}.terradatum.com" --tags \
      "ENV=PROD,Name=${node}.terradatum.com,APP=elasticsearch,Cluster=prod1" \
      --server-connect-attribute private_ip_address --iam-profile chefAwsRole;
  done

Trust your cluster’s CA TLS/SSL Certificate (that you created with the certs and keys) on OSX/MACOS

  • If not on OSX/MACOS, use the appropriate alternatives here (or see the command-line sketch below).
    • For OSX/MACOS, go to Keychain Access:
      • Trust the ca.crt (available from either LastPass or in /etc/elasticsearch/certs on the cluster nodes) in your keychain
        • File - Import
        • Import the ca.crt
        • Open Trust, and under SSL select Always Trust
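If you prefer the command line to Keychain Access, something like the following should be equivalent (adds the CA to the system keychain; adjust the keychain path to taste):

  # trust the cluster CA system-wide on macOS
  sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ca.crt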

LOCAL development using Chef Kitchen (limited functionality but good for baseline / minimal testing)

  # configure aliases for easy kitchen use
  alias elastic1="kitchen login elastic1-centos-7"
  alias elastic2="kitchen login elastic2-centos-7"
  alias elastic3="kitchen login elastic3-centos-7"
  alias kibana1="kitchen login kibana1-centos-7"

deploy two elasticsearch suites (two nodes for testing)

  kitchen converge 'elastic1|elastic2' -c

deploy two elastic suites/nodes and one kibana

  kitchen converge 'elastic1|elastic2|kibana1' -c
  -----> Starting Kitchen (v1.19.2)
  -----> Creating <elastic1-centos-7>...
  -----> Creating <elastic2-centos-7>...
  -----> Creating <kibana1-centos-7>...

  for i in elastic{1..3}; do kitchen exec $i -c "hostname -I"; done
  -----> Execute command on elastic1-centos-7.
  10.0.2.15 172.28.128.6
  -----> Execute command on elastic2-centos-7.
  10.0.2.15 172.28.128.7
  -----> Execute command on elastic3-centos-7.
  10.0.2.15 172.28.128.8

kitchen.yml file

  concurrency: 2
  driver:
    name: vagrant
    cachier: true
    use_sudo: false
    privileged: true
  provisioner:
    name: chef_zero
    log_level: info
    always_update_cookbooks: true
  verifier:
    name: inspec
    format: documentation
  platforms:
    - name: centos-7
  suites:
    - name: elastic1
      driver:
        network:
          - ["private_network", { type: "dhcp" }]
        vm_hostname: elastic1.local
        customize:
          name: elastic1
          memory: 2048
          cpus: 2
        synced_folders:
          - ["/var/tmp/data", "/home/vagrant/data", "create: true, type: :nfs"]
      run_list:
        - recipe[td_elasticsearch::default]
    - name: elastic2
      driver:
        network:
          - ["private_network", { type: "dhcp" }]
        vm_hostname: elastic2.local
        customize:
          name: elastic2
          memory: 2048
          cpus: 2
        synced_folders:
          - ["/var/tmp/data", "/home/vagrant/data", "create: true, type: :nfs"]
      run_list:
        - recipe[td_elasticsearch::default]
    - name: elastic3
      driver:
        network:
          - ["private_network", { type: "dhcp" }]
        vm_hostname: elastic3.local
        customize:
          name: elastic3
          memory: 2048
          cpus: 2
        synced_folders:
          - ["/var/tmp/data", "/home/vagrant/data", "create: true, type: :nfs"]
      run_list:
        - recipe[td_elasticsearch::default]
    - name: kibana1
      driver:
        network:
          - ["private_network", { type: "dhcp" }]
        vm_hostname: kibana1.local
        customize:
          name: kibana1
          memory: 2048
          cpus: 2
        synced_folders:
          - ["/var/tmp/data", "/home/vagrant/data", "create: true, type: :nfs"]
      run_list:
        - recipe[td_elasticsearch::default]
        - recipe[td_elasticsearch::kibana]