Project author: PaladinStudiosBVs

Project description:
Ridiculously easy to use tool for populating Mongo databases.
Language: Python
Repository: git://github.com/PaladinStudiosBVs/mongo-populator.git
Created: 2017-01-12T19:59:08Z
Project community: https://github.com/PaladinStudiosBVs/mongo-populator

License: GNU General Public License v3.0


Mongo Populator

Mongo Populator is a tool that effortlessly populates a Mongo database with a dump that was extracted from somewhere else.
You can either use a local dump, a dump from another Mongo database or a dump located in Amazon S3.

  • Supported sources: local directory, local database (dockerized or not), remote database via SSH (dockerized or
    not), Amazon S3 bucket.
  • Supported destinations: local database (dockerized or not), remote database via SSH (dockerized or not),
    Amazon S3 bucket.

Disclaimer: this is still under heavy development, so use it at your own risk!

Installation

In order to install Mongo Populator, follow these steps:

  1. git clone https://github.com/PaladinStudiosBVs/mongo-populator.git
  2. (optional) make tests
  3. sudo make install

Note for MacPorts users: the mongo-populator executable is copied to
/opt/local/Library/Frameworks/Python.framework/Versions/Current/bin/. Unless that directory is
already in your PATH, create a symbolic link to the executable in a directory that is.

Compatibility notes

Mongo Populator is supposed to work with Python 3.3+. If you want your version of Python to be
supported, feel free to contribute to the project.

Usage

Here are some examples of the supported use cases. I will be showing how to do it with command-line options and with
configuration file properties. I assume you will then be able to do the same with environment variables.

From a dump in a local directory to a local Mongo database

Command-line options

    mongo-populator --source-use-local-dump \
        --source-dump-dir /path/to/dump/directory \
        --destination-use-local-db \
        --destination-db-name <db-name> \
        [--destination-db-user <db-user> \]
        [--destination-db-password <db-password> \]
        [--destination-db-restore-indexes \]
        [--destination-drop-db]

Properties in configuration file

    source_use_local_dump = True
    source_dump_dir = /path/to/local/dump/dir
    destination_db_name = test_db
    destination_db_user = test_user
    destination_db_password = test_password
    # mongorestore will use --noIndexRestore
    destination_db_restore_indexes = False
    # mongorestore will use --drop option
    destination_drop_db = True
    destination_use_local_db = True
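
The comments in the properties above say how two of the settings map onto mongorestore flags. A minimal sketch of that mapping follows. The function name `build_mongorestore_args` and the exact argument order are assumptions made for illustration, not Mongo Populator's actual implementation.

```python
def build_mongorestore_args(props, dump_dir):
    """Translate destination_* properties into a mongorestore argument list.

    Hypothetical helper: it only mirrors the mapping described in the
    configuration comments (restore_indexes = False -> --noIndexRestore,
    drop_db = True -> --drop).
    """
    args = ["mongorestore", "--db", props["destination_db_name"]]
    if props.get("destination_db_user"):
        args += ["--username", props["destination_db_user"]]
    if props.get("destination_db_password"):
        args += ["--password", props["destination_db_password"]]
    # Indexes are skipped unless the property is explicitly True.
    if not props.get("destination_db_restore_indexes", False):
        args.append("--noIndexRestore")
    # Drop existing collections before restoring when requested.
    if props.get("destination_drop_db", False):
        args.append("--drop")
    args.append(dump_dir)
    return args


example = {
    "destination_db_name": "test_db",
    "destination_db_user": "test_user",
    "destination_db_password": "test_password",
    "destination_db_restore_indexes": False,
    "destination_drop_db": True,
}
print(" ".join(build_mongorestore_args(example, "/path/to/local/dump/dir")))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting.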

From a dump in a local directory to a remote Mongo database (via SSH)

Command-line options

    mongo-populator --source-use-local-dump \
        --source-dump-dir /path/to/dump/directory \
        --destination-use-ssh \
        --destination-db-name <db-name> \
        [--destination-db-user <db-user> \]
        [--destination-db-password <db-password> \]
        [--destination-db-restore-indexes \]
        [--destination-drop-db \]
        --destination-ssh-host <host> \
        --destination-ssh-user <user> \
        [--destination-ssh-password <password> \]
        [--destination-ssh-key-file <file>]

Properties in configuration file

    source_use_local_dump = True
    source_dump_dir = /path/to/local/dump/dir
    destination_db_name = test_db
    destination_db_user = test_user
    destination_db_password = test_password
    # mongorestore will use --noIndexRestore
    destination_db_restore_indexes = False
    # mongorestore will use --drop option
    destination_drop_db = True
    destination_use_ssh = True
    destination_ssh_host = 127.0.0.1
    destination_ssh_user = ubuntu
    # Password can be empty, as long as you have a key file or
    # you have an authorized key pair
    destination_ssh_password =
    destination_ssh_key_file = /path/to/key_file.pem

Note that if you specify a password, you most likely won’t need to specify the identity file, and vice versa. If you
have an authorized key pair, you need to specify neither the password nor the identity file.
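
The three SSH authentication cases in this note can be summarised as a small decision function. This is only a sketch: the name `ssh_auth_method` is hypothetical, and preferring the key file when both values are set is an assumption, since the note does not say which one wins.

```python
def ssh_auth_method(password=None, key_file=None):
    """Pick an SSH authentication method from the two optional settings.

    Assumption: a key file takes precedence over a password when both are
    given. With neither set, fall back to an already authorized key pair.
    """
    if key_file:
        return {"method": "key_file", "key_file": key_file}
    if password:
        return {"method": "password", "password": password}
    return {"method": "authorized_key_pair"}


print(ssh_auth_method(key_file="/path/to/key_file.pem"))
print(ssh_auth_method())
```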

From a remote Mongo database (via SSH) inside a Docker container to a remote Mongo database (via SSH)

Command-line options

    mongo-populator --source-use-ssh \
        --source-ssh-host 123.123.123.123 \
        --source-ssh-user some_user \
        [--source-ssh-password <password> \]
        [--source-ssh-key-file /path/to/source/key_file.pem \]
        --source-db-name source_db \
        [--source-db-user source_user \]
        [--source-db-password source_password \]
        --source-is-dockerized \
        --source-docker-container-name test_mongo \
        --destination-use-ssh \
        --destination-ssh-host 127.0.0.1 \
        --destination-ssh-user ubuntu \
        [--destination-ssh-password <password> \]
        [--destination-ssh-key-file /path/to/key_file.pem \]
        --destination-db-name test_db \
        [--destination-db-user test_user \]
        [--destination-db-password test_password \]
        [--destination-db-restore-indexes \]
        [--destination-drop-db]

Properties in configuration file

    source_db_name = source_db
    source_db_user = source_user
    source_db_password = source_password
    source_use_ssh = True
    source_ssh_host = 123.123.123.123
    source_ssh_user = some_user
    source_ssh_password =
    source_ssh_key_file = /path/to/source/key_file.pem
    source_is_dockerized = True
    source_docker_container_name = test_mongo
    destination_db_name = test_db
    destination_db_user = test_user
    destination_db_password = test_password
    # mongorestore will use --noIndexRestore
    destination_db_restore_indexes = False
    # mongorestore will use --drop option
    destination_drop_db = True
    destination_use_ssh = True
    destination_ssh_host = 127.0.0.1
    destination_ssh_user = ubuntu
    # Password can be empty, as long as you have a key file or
    # you have an authorized key pair
    destination_ssh_password =
    destination_ssh_key_file = /path/to/key_file.pem
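
When the source is dockerized, mongodump has to run inside the container and the resulting dump directory has to be copied out to the host. The sketch below builds the two commands that would do this. The helper name and the directory arguments are hypothetical, and the tool may well run different commands internally.

```python
def dockerized_dump_commands(container, db_name, container_dir, host_dir):
    """Return the two argument lists needed to dump a dockerized database.

    Hypothetical helper: step 1 runs mongodump inside the container,
    step 2 copies the dump directory from the container to the host.
    """
    dump = ["docker", "exec", container,
            "mongodump", "--db", db_name, "--out", container_dir]
    copy = ["docker", "cp", "{}:{}".format(container, container_dir), host_dir]
    return [dump, copy]


for command in dockerized_dump_commands(
        "test_mongo", "source_db", "/tmp/mongodumps", "/tmp/host-dump"):
    print(" ".join(command))
```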

From an Amazon S3 bucket to a local Mongo database

Note that if you have previously configured your AWS credentials (for example, using aws configure), you
don’t have to specify the access key ID, the secret access key, or even the region name.

Command-line options

    mongo-populator --source-use-s3 \
        [--source-s3-access-key-id <s3-access-key-id> \]
        --source-s3-secret-access-key <s3-secret-access-key> \
        --source-s3-region-name <s3-region-name> \
        --source-s3-bucket <s3-bucket-name> \
        --source-s3-prefix <s3-prefix> \
        --destination-use-local-db \
        --destination-db-name <db-name> \
        [--destination-db-user <db-user> \]
        [--destination-db-password <db-password> \]
        [--destination-db-restore-indexes \]
        [--destination-drop-db]

Properties in configuration file

    source_use_s3 = True
    source_s3_access_key_id = access_key_id
    source_s3_secret_access_key = secret_access_key
    # e.g. eu-west-1
    source_s3_region_name = region_name
    source_s3_bucket = bucket_name
    source_s3_prefix = some_prefix
    destination_db_name = test_db
    destination_db_user = test_user
    destination_db_password = test_password
    # mongorestore will use --noIndexRestore
    destination_db_restore_indexes = False
    # mongorestore will use --drop option
    destination_drop_db = True
    destination_use_local_db = True

From a remote Mongo database (via SSH) to an Amazon S3 bucket

Note that if you specify an s3 prefix, files will be stored under s3-bucket/prefix/%Y%m%d-%H%M%S/destination_db_name/.
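
As a sketch of how such a storage path is put together: the timestamp pattern `%Y%m%d-%H%M%S` comes from the note above, while the function name `s3_dump_prefix` and the use of local time are assumptions for illustration.

```python
from datetime import datetime


def s3_dump_prefix(prefix, db_name, now=None):
    """Build the key prefix under which a dump would be stored in the bucket.

    Hypothetical helper: whether the real tool uses local time or UTC is
    not stated, so local time is assumed here.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return "{}/{}/{}/".format(prefix, stamp, db_name)


print(s3_dump_prefix("some_prefix", "test_db", datetime(2017, 1, 12, 19, 59, 8)))
# -> some_prefix/20170112-195908/test_db/
```

Because the timestamp sorts lexicographically, listing the bucket by prefix returns the dumps in chronological order.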

Command-line options

    mongo-populator --source-use-ssh \
        --source-ssh-host 123.123.123.123 \
        --source-ssh-user some_user \
        [--source-ssh-password <password> \]
        [--source-ssh-key-file /path/to/source/key_file.pem \]
        --source-db-name source_db \
        [--source-db-user source_user \]
        [--source-db-password source_password \]
        --destination-db-name test_db \
        --destination-use-s3 \
        --destination-s3-access-key-id access_key_id \
        --destination-s3-secret-access-key secret_access_key \
        --destination-s3-region-name eu-west-1 \
        --destination-s3-bucket bucket_name \
        --destination-s3-prefix some_prefix

Properties in configuration file

    source_db_name = source_db
    source_db_user = source_user
    source_db_password = source_password
    source_use_ssh = True
    source_ssh_host = 123.123.123.123
    source_ssh_user = some_user
    source_ssh_password =
    source_ssh_key_file = /path/to/source/key_file.pem
    destination_db_name = test_db
    destination_use_s3 = True
    destination_s3_access_key_id = access_key_id
    destination_s3_secret_access_key = secret_access_key
    # e.g. eu-west-1
    destination_s3_region = region_name
    destination_s3_bucket = bucket_name
    destination_s3_prefix = some_prefix

Command-line options

Here is a full list of command-line options:

    -h, --help            show this help message and exit
    -v, --verbose         verbose mode (-vvv for more)

    Source:
      --source-db-name SOURCE_DB_NAME
                            Name of the local source Database (default: None)
      --source-db-user SOURCE_DB_USER
                            User to connect to source database (default: None)
      --source-db-password SOURCE_DB_PASSWORD
                            Password to connect to source database (default: None)
      --source-use-local-db
                            Indicates if you want to use a local database or not
                            (default: False)
      --source-use-local-dump
                            Indicates if you want to use a local dump or not
                            (default: False)
      --source-dump-dir SOURCE_DUMP_DIR
                            Directory where the source dump is located (default: None)
      --source-tmp-dir SOURCE_TMP_DIR
                            Directory where source dumps will be copied to
                            (default: ~/.mongo-populator/tmp)
      --source-use-ssh      Indicates if you want to connect to source DB via SSH
                            (default: False)
      --source-ssh-host SOURCE_SSH_HOST
                            SSH host we're connecting to if we decide to use SSH
                            for the source (default: 127.0.0.1)
      --source-ssh-user SOURCE_SSH_USER
                            SSH user to connect to source (default: None)
      --source-ssh-password SOURCE_SSH_PASSWORD
                            SSH password to connect to source (default: None)
      --source-ssh-key-file SOURCE_SSH_KEY_FILE
                            SSH identity file to use to connect to host (default: None)
      --source-is-dockerized
                            Indicates whether the source database is running
                            inside Docker or not. (default: False)
      --source-docker-container-name SOURCE_DOCKER_CONTAINER_NAME
                            The name of the Docker container where the database is
                            running (default: None)
      --source-use-s3       Retrieve source dump from an Amazon S3 bucket
                            (default: False)
      --source-s3-access-key-id SOURCE_S3_ACCESS_KEY_ID
                            Access key to the Amazon S3 bucket (default: None)
      --source-s3-secret-access-key SOURCE_S3_SECRET_ACCESS_KEY
                            Secret access key to the Amazon S3 bucket (default: None)
      --source-s3-region-name SOURCE_S3_REGION_NAME
                            Region used by the Amazon S3 bucket (e.g. eu-west-1)
                            (default: None)
      --source-s3-bucket SOURCE_S3_BUCKET
                            Amazon S3 bucket where the dump is stored (default: None)
      --source-s3-prefix SOURCE_S3_PREFIX
                            Prefix to be use when fetching objects from the S3
                            bucket (default: None)

    Destination:
      --destination-db-name DESTINATION_DB_NAME
                            Name of the local destination Database (default: None)
      --destination-db-user DESTINATION_DB_USER
                            User to connect to destination database (default: None)
      --destination-db-password DESTINATION_DB_PASSWORD
                            Password to connect to destination database (default: None)
      --destination-db-restore-indexes
                            Indicates whether you want to restore indexes from the
                            dump or not (default: False)
      --destination-drop-db
                            Indicates whether you want to drop the destination
                            database (default: False)
      --destination-use-local-db
                            Indicates whether you want to restore a local
                            database. (default: False)
      --destination-use-ssh
                            Indicates if you want to connect via SSH to
                            destination database. (default: False)
      --destination-ssh-host DESTINATION_SSH_HOST
                            SSH host we're connecting to if we decide to use SSH
                            for the source (default: 127.0.0.1)
      --destination-ssh-user DESTINATION_SSH_USER
                            SSH user to connect to destination (default: None)
      --destination-ssh-password DESTINATION_SSH_PASSWORD
                            SSH password to connect to destination (default: None)
      --destination-ssh-key-file DESTINATION_SSH_KEY_FILE
                            SSH identity file to use to connect to host (default: None)
      --destination-use-s3  Store dump in an Amazon S3 bucket (default: False)
      --destination-s3-access-key-id DESTINATION_S3_ACCESS_KEY_ID
                            Access key to the Amazon S3 bucket (default: None)
      --destination-s3-secret-access-key DESTINATION_S3_SECRET_ACCESS_KEY
                            Secret access key to the Amazon S3 bucket (default: None)
      --destination-s3-region-name DESTINATION_S3_REGION_NAME
                            Region used by the Amazon S3 bucket (e.g. eu-west-1) (default: None)
      --destination-s3-bucket DESTINATION_S3_BUCKET
                            Amazon S3 bucket where the dump will stored (default: None)
      --destination-s3-prefix DESTINATION_S3_PREFIX
                            Prefix to be used when storing objects in the S3 bucket (default: None)

Environment variables

Instead of providing command-line options, you can define environment variables with the desired values. Note that
command-line options have the highest priority, which means that if you provide them, the corresponding values will
be used instead. Here is a list of available environment variables that you can define:

    MONGO_POPULATOR_SOURCE_DB_NAME
    MONGO_POPULATOR_SOURCE_DB_USER
    MONGO_POPULATOR_SOURCE_DB_PASSWORD
    MONGO_POPULATOR_SOURCE_USE_LOCAL_DB
    MONGO_POPULATOR_SOURCE_USE_LOCAL_DUMP
    MONGO_POPULATOR_SOURCE_DUMP_DIR
    MONGO_POPULATOR_SOURCE_TMP_DIR
    MONGO_POPULATOR_SOURCE_USE_SSH
    MONGO_POPULATOR_SOURCE_SSH_HOST
    MONGO_POPULATOR_SOURCE_SSH_USER
    MONGO_POPULATOR_SOURCE_SSH_PASSWORD
    MONGO_POPULATOR_SOURCE_SSH_KEY_FILE
    MONGO_POPULATOR_SOURCE_IS_DOCKERIZED
    MONGO_POPULATOR_SOURCE_DOCKER_CONTAINER_NAME
    MONGO_POPULATOR_SOURCE_USE_S3
    MONGO_POPULATOR_SOURCE_S3_ACCESS_KEY_ID
    MONGO_POPULATOR_SOURCE_S3_SECRET_ACCESS_KEY
    MONGO_POPULATOR_SOURCE_S3_REGION_NAME
    MONGO_POPULATOR_SOURCE_S3_BUCKET
    MONGO_POPULATOR_SOURCE_S3_PREFIX
    MONGO_POPULATOR_DESTINATION_DB_NAME
    MONGO_POPULATOR_DESTINATION_DB_USER
    MONGO_POPULATOR_DESTINATION_DB_PASSWORD
    MONGO_POPULATOR_DESTINATION_DROP_DB
    MONGO_POPULATOR_DESTINATION_DB_RESTORE_INDEXES
    MONGO_POPULATOR_DESTINATION_USE_LOCAL_DB
    MONGO_POPULATOR_DESTINATION_USE_SSH
    MONGO_POPULATOR_DESTINATION_SSH_HOST
    MONGO_POPULATOR_DESTINATION_SSH_USER
    MONGO_POPULATOR_DESTINATION_SSH_PASSWORD
    MONGO_POPULATOR_DESTINATION_SSH_KEY_FILE
    MONGO_POPULATOR_DESTINATION_USE_S3
    MONGO_POPULATOR_DESTINATION_S3_ACCESS_KEY_ID
    MONGO_POPULATOR_DESTINATION_S3_SECRET_ACCESS_KEY
    MONGO_POPULATOR_DESTINATION_S3_REGION_NAME
    MONGO_POPULATOR_DESTINATION_S3_BUCKET
    MONGO_POPULATOR_DESTINATION_S3_PREFIX
    MONGO_POPULATOR_FORCE_COLOR
    MONGO_POPULATOR_NOCOLOR
    MONGO_POPULATOR_COLOR_HIGHLIGHT
    MONGO_POPULATOR_COLOR_VERBOSE
    MONGO_POPULATOR_COLOR_WARN
    MONGO_POPULATOR_COLOR_ERROR
    MONGO_POPULATOR_COLOR_DEBUG
    MONGO_POPULATOR_COLOR_DEPRECATE
    MONGO_POPULATOR_COLOR_SKIP
    MONGO_POPULATOR_COLOR_UNREACHABLE
    MONGO_POPULATOR_COLOR_OK
    MONGO_POPULATOR_COLOR_CHANGED
    MONGO_POPULATOR_COLOR_DIFF_ADD
    MONGO_POPULATOR_COLOR_DIFF_REMOVE
    MONGO_POPULATOR_COLOR_DIFF_LINES
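
The source and destination variables in this list follow the names of the corresponding command-line options. The sketch below derives a variable name from an option and applies the rule that a command-line value wins over the environment. The helpers `env_var_name` and `resolve` are hypothetical names, not part of Mongo Populator.

```python
import os

ENV_PREFIX = "MONGO_POPULATOR_"


def env_var_name(option):
    """Map an option such as --source-db-name to its environment variable."""
    return ENV_PREFIX + option.lstrip("-").replace("-", "_").upper()


def resolve(option, cli_value=None, environ=None, default=None):
    """Return the effective value: command line first, then environment."""
    if cli_value is not None:
        return cli_value
    environ = os.environ if environ is None else environ
    return environ.get(env_var_name(option), default)


fake_env = {"MONGO_POPULATOR_SOURCE_DB_NAME": "from_env"}
print(env_var_name("--source-db-name"))
print(resolve("--source-db-name", cli_value="from_cli", environ=fake_env))
print(resolve("--source-db-name", environ=fake_env))
```

Passing the environment in as a dictionary keeps the lookup easy to test without touching the real process environment.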

Configuration file

For convenience, you can keep your desired values in a configuration file instead of passing command-line options
every time. mongo-populator first looks for a file called mongo-populator.cfg in the current working directory. If
it doesn’t find one, it tries ~/.mongo-populator.cfg, and then /etc/mongo-populator/mongo-populator.cfg. If none
of these exist, it uses default values.
Here is an example of a configuration file:

    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # Unless you are using a local directory with a dump or an Amazon S3
    # bucket, you'll have to fill in these. The values should be either of
    # your local source database or your local
    source_db_name = test_db
    source_db_user = test_user
    source_db_password = test_password
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # Set this to True if you intend to extract a dump from a database running
    # locally.
    source_use_local_db = True
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # If you want to use a dump in a local directory instead, set this to True
    # and change source_dump_dir accordingly, with a path to the dump directory.
    #source_use_local_dump = True
    #source_dump_dir = /path/to/dump/directory
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # When extracting a dump from a database, mongo-populator stores it locally.
    # Use this property to specify where dumps should be stored or leave it as is.
    # A new dump from a database called xpto will exist in ~/.mongo-populator/tmp/%Y%m%d-%H%M%S/xpto
    source_tmp_dir = ~/.mongo-populator/tmp
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # Set source_use_ssh to True in case you need to access your source database
    # via SSH. You should also fill in the following properties with the correct values.
    # If you specify a key file, most likely you won't need to specify the password,
    # and vice-versa.
    #source_use_ssh = False
    #source_ssh_host = 127.0.0.1
    #source_ssh_user =
    #source_ssh_password =
    #source_ssh_key_file =
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # Sometimes, the source database is running inside a docker container.
    # In such situation, mongodump must be executed inside the container and
    # the output directory must be copied from the container to the host.
    source_is_dockerized = True
    source_docker_container_name = test_db_container
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    # In case you want to populate a Mongo database with a dump stored
    # in Amazon S3. Note that if you have aws command-line tools and if you
    # have configured your credentials using `aws configure`, then you don't
    # need to fill in the access_key_id, the secret_access_key and the region,
    # as long as your credentials give you access to the bucket.
    #source_use_s3 = False
    #source_s3_access_key_id =
    #source_s3_secret_access_key =
    #source_s3_region =
    #source_s3_bucket =
    #source_s3_prefix =
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    destination_db_name = some_db_name
    destination_db_user = some_db_user
    destination_db_password = some_db_password
    destination_db_restore_indexes = False
    destination_drop_db = True
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    #destination_use_local_db = False
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    destination_use_ssh = True
    destination_ssh_host = 123.123.123.123
    destination_ssh_user = ubuntu
    destination_ssh_password =
    destination_ssh_key_file = /path/to/key.pem
    #;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    #destination_use_s3 = False
    #destination_s3_access_key_id =
    #destination_s3_secret_access_key =
    #destination_s3_region =
    #destination_s3_bucket =
    #destination_s3_prefix =
    # set to 1 if you don't want colors, or export MONGO_POPULATOR_NOCOLOR=1
    #nocolor = 1
    [colors]
    highlight = white
    verbose = blue
    warn = bright purple
    error = red
    debug = dark gray
    deprecate = purple
    skip = cyan
    unreachable = red
    ok = green
    changed = yellow
    diff_add = green
    diff_remove = red
    diff_lines = cyan
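
The lookup order for the configuration file (current working directory, then the home directory, then /etc) can be sketched as follows. The function `find_config_file` and its injectable `exists` parameter are assumptions made so the example is easy to test. They are not the tool's actual code.

```python
import os


def find_config_file(cwd=None, home=None, exists=os.path.isfile):
    """Return the first configuration file found, or None to use defaults.

    Hypothetical helper that checks the three documented locations in order.
    """
    cwd = cwd or os.getcwd()
    home = home or os.path.expanduser("~")
    candidates = [
        os.path.join(cwd, "mongo-populator.cfg"),
        os.path.join(home, ".mongo-populator.cfg"),
        "/etc/mongo-populator/mongo-populator.cfg",
    ]
    for path in candidates:
        if exists(path):
            return path
    return None


print(find_config_file() or "no configuration file found, using defaults")
```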

TODO

  • Add the ability to specify a configuration file as a command-line argument. Something like $ mongo-populator /path/to/file.cfg
  • Allow custom temporary directory in remote hosts. Right now, by default it stores dumps inside /tmp/mongodumps/
  • Add proper logging (useful if I’m running mongo-populator as a cron job)
  • Improve tests :}

License

Mongo Populator is released under the GNU General Public License v3.0. See the project repository for the full text.