How to move Elasticsearch data from one server to another

Question 1

How do I move Elasticsearch data from one server to another?

I have server A running Elasticsearch 1.1.1 on one local node with multiple indices. I would like to copy that data to server B running Elasticsearch 1.3.4.

Procedure so far

Shut down ES on both servers and
copy all the data to the correct data dir on the new server. (data seems to be located at /var/lib/elasticsearch/ on my debian boxes)
change permissions and ownership to elasticsearch:elasticsearch
start up the new ES server

When I look at the cluster with the ES head plugin, no indices appear.

It seems that the data is not loaded. Am I missing something?

Question 2

The selected answer makes it sound slightly more complex than it is. The following is what you need (install npm on your system first).

npm install -g elasticdump
elasticdump --input=http://mysrc.com:9200/my_index --output=http://mydest.com:9200/my_index --type=mapping
elasticdump --input=http://mysrc.com:9200/my_index --output=http://mydest.com:9200/my_index --type=data

You can skip the first elasticdump command for subsequent copies if the mappings have not changed.
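
As a rough sketch, the two commands above could be wrapped in a small Python helper that builds the mapping and data invocations for one index. The names `elasticdump_commands`, `copy_index`, and the `skip_mapping` flag are made up for illustration; the only elasticdump options used are `--input`, `--output`, and `--type` as shown above, and the sketch assumes `elasticdump` is on your PATH.

```python
import subprocess


def elasticdump_commands(src, dest, index, skip_mapping=False):
    """Build the elasticdump invocations for one index (hypothetical helper)."""
    # Mapping goes first so the destination index exists with the right
    # field types before any documents are written.
    types = ["data"] if skip_mapping else ["mapping", "data"]
    commands = []
    for dump_type in types:
        commands.append([
            "elasticdump",
            f"--input={src}/{index}",
            f"--output={dest}/{index}",
            f"--type={dump_type}",
        ])
    return commands


def copy_index(src, dest, index, skip_mapping=False):
    """Run each command in order, stopping at the first failure."""
    for cmd in elasticdump_commands(src, dest, index, skip_mapping):
        subprocess.run(cmd, check=True)
```

Calling `copy_index("http://mysrc.com:9200", "http://mydest.com:9200", "my_index")` would run the same two commands as above; passing `skip_mapping=True` corresponds to skipping the first one on later copies.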

I have just done a migration from AWS to Qbox.io with the above without any problems.

More details at:

https://www.npmjs.com/package/elasticdump

The help page (as of February 2016) is included for completeness:

elasticdump: Import and export tools for elasticsearch

Usage: elasticdump --input SOURCE --output DESTINATION [OPTIONS]

--input
                    Source location (required)
--input-index
                    Source index and type
                    (default: all, example: index/type)
--output
                    Destination location (required)
--output-index
                    Destination index and type
                    (default: all, example: index/type)
--limit
                    How many objects to move in bulk per operation
                    limit is approximate for file streams
                    (default: 100)
--debug
                    Display the elasticsearch commands being used
                    (default: false)
--type
                    What are we exporting?
                    (default: data, options: [data, mapping])
--delete
                    Delete documents one-by-one from the input as they are
                    moved.  Will not delete the source index
                    (default: false)
--searchBody
                    Preform a partial extract based on search results
                    (when ES is the input,
                    (default: '{"query": { "match_all": {} } }'))
--sourceOnly
                    Output only the json contained within the document _source
                    Normal: {"_index":"","_type":"","_id":"", "_source":{SOURCE}}
                    sourceOnly: {SOURCE}
                    (default: false)
--all
                    Load/store documents from ALL indexes
                    (default: false)
--bulk
                    Leverage elasticsearch Bulk API when writing documents
                    (default: false)
--ignore-errors
                    Will continue the read/write loop on write error
                    (default: false)
--scrollTime
                    Time the nodes will hold the requested search in order.
                    (default: 10m)
--maxSockets
                    How many simultaneous HTTP requests can we process make?
                    (default:
                      5 [node <= v0.10.x] /
                      Infinity [node >= v0.11.x] )
--bulk-mode
                    The mode can be index, delete or update.
                    'index': Add or replace documents on the destination index.
                    'delete': Delete documents on destination index.
                    'update': Use 'doc_as_upsert' option with bulk update API to do partial update.
                    (default: index)
--bulk-use-output-index-name
                    Force use of destination index name (the actual output URL)
                    as destination while bulk writing to ES. Allows
                    leveraging Bulk API copying data inside the same
                    elasticsearch instance.
                    (default: false)
--timeout
                    Integer containing the number of milliseconds to wait for
                    a request to respond before aborting the request. Passed
                    directly to the request library. If used in bulk writing,
                    it will result in the entire batch not being written.
                    Mostly used when you don't care too much if you lose some
                    data when importing but rather have speed.
--skip
                    Integer containing the number of rows you wish to skip
                    ahead from the input transport.  When importing a large
                    index, things can go wrong, be it connectivity, crashes,
                    someone forgetting to `screen`, etc.  This allows you
                    to start the dump again from the last known line written
                    (as logged by the `offset` in the output).  Please be
                    advised that since no sorting is specified when the
                    dump is initially created, there's no real way to
                    guarantee that the skipped rows have already been
                    written/parsed.  This is more of an option for when
                    you want to get most data as possible in the index
                    without concern for losing some rows in the process,
                    similar to the `timeout` option.
--inputTransport
                    Provide a custom js file to us as the input transport
--outputTransport
                    Provide a custom js file to us as the output transport
--toLog
                    When using a custom outputTransport, should log lines
                    be appended to the output stream?
                    (default: true, except for `$`)
--help
                    This page

Examples:

# Copy an index from production to staging with mappings:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=mapping
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=http://staging.es.com:9200/my_index \
  --type=data

# Backup index data to a file:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=/data/my_index.json \
  --type=data

# Backup and index to a gzip using stdout:
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=$ \
  | gzip > /data/my_index.json.gz

# Backup ALL indices, then use Bulk API to populate another ES cluster:
elasticdump \
  --all=true \
  --input=http://production-a.es.com:9200/ \
  --output=/data/production.json
elasticdump \
  --bulk=true \
  --input=/data/production.json \
  --output=http://production-b.es.com:9200/

# Backup the results of a query to a file
elasticdump \
  --input=http://production.es.com:9200/my_index \
  --output=query.json \
  --searchBody '{"query":{"term":{"username": "admin"}}}'

------------------------------------------------------------------------------
Learn more @ https://github.com/taskrabbit/elasticsearch-dump

Question 3

Use ElasticDump

1) yum install epel-release

2) yum install nodejs

3) yum install npm

4) npm install elasticdump

5) cd node_modules/elasticdump/bin

6)

./elasticdump \
  --input=http://192.168.1.1:9200/original \
  --output=http://192.168.1.2:9200/newCopy \
  --type=data

Question 4

You can use the snapshot/restore feature available in Elasticsearch for this. Once you have set up a filesystem-based snapshot repository, you can move it between clusters and restore it on a different cluster.
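
A minimal sketch of the two REST calls this involves on the source cluster, written in Python so the requests can be inspected before anything is sent. The helper name `snapshot_requests`, the repository name, the snapshot name, and the backup location are all placeholders, not values from this thread; the endpoints are the standard `_snapshot` API.

```python
import json
import urllib.request


def snapshot_requests(host, repo, location, snapshot):
    """Build (but do not send) the requests that register a filesystem
    repository and take a snapshot into it. Hypothetical helper."""
    register = urllib.request.Request(
        f"{host}/_snapshot/{repo}",
        data=json.dumps({"type": "fs", "settings": {"location": location}}).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    # wait_for_completion makes the call block until the snapshot is done.
    create = urllib.request.Request(
        f"{host}/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
        method="PUT",
    )
    return register, create


# Example with placeholder values; pass each request to
# urllib.request.urlopen() to actually send it.
register, create = snapshot_requests(
    "http://localhost:9200", "my_backup", "/mnt/backups/my_backup", "snapshot_1"
)
```

After the snapshot finishes, the repository directory can be copied to the other machine and registered there under the same type. On newer releases the location usually also has to be listed under `path.repo` in `elasticsearch.yml`, so check the documentation for your version.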

Question 5

I tried this on Ubuntu to move data from ELK 2.4.3 to ELK 5.1.1.

The steps are as follows:

$ sudo apt-get update

$ sudo apt-get install -y python-software-properties python g++ make

$ sudo add-apt-repository ppa:chris-lea/node.js

$ sudo apt-get update

$ sudo apt-get install npm

$ sudo apt-get install nodejs

$ npm install colors

$ npm install nomnom

$ npm install elasticdump

From your home directory, go to

$ cd node_modules/elasticdump/

and run the command.

If you need basic HTTP auth, you can use it like this:

--input=http://name:password@localhost:9200/my_index

Copy an index from production:

$ ./bin/elasticdump --input="http://Source:9200/Sourceindex" --output="http://username:password@Destination:9200/Destination_index"  --type=data

Question 6

There is also the _reindex option.

From the documentation:

Through the Elasticsearch reindex API, available in version 5.x and later, you can connect your new Elasticsearch Service deployment remotely to your old Elasticsearch cluster. This pulls the data from your old cluster and indexes it into your new one. Reindexing essentially rebuilds the index from scratch and it can be more resource intensive to run.

POST _reindex
{
  "source": {
    "remote": {
      "host": "https://REMOTE_ELASTICSEARCH_ENDPOINT:PORT",
      "username": "USER",
      "password": "PASSWORD"
    },
    "index": "INDEX_NAME",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "INDEX_NAME"
  }
}
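
If you need to run this for several indices, one option is to generate the request body in a script. The sketch below only builds the JSON shown above; the function name `remote_reindex_body` and its parameters are illustrative, and the host in the example call is a placeholder.

```python
import json


def remote_reindex_body(remote_host, index, username=None, password=None, dest_index=None):
    """Build the body for POST _reindex that pulls one index from a remote cluster.
    Hypothetical helper; field names follow the request shown above."""
    remote = {"host": remote_host}
    if username is not None:
        # Credentials are optional and only needed if the old cluster requires them.
        remote["username"] = username
        remote["password"] = password
    return {
        "source": {"remote": remote, "index": index, "query": {"match_all": {}}},
        # By default the destination index keeps the source index name.
        "dest": {"index": dest_index or index},
    }


print(json.dumps(remote_reindex_body("https://old-cluster.example:9200", "my_index"), indent=2))
```

As far as I know, the new cluster also has to allow the old host explicitly through the `reindex.remote.whitelist` setting in `elasticsearch.yml`, otherwise the request is rejected.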

Question 7

If you can add the second server to the cluster, you can do this:

Add Server B to the cluster with Server A
Increase the number of replicas for the indices
ES will automatically copy the indices to Server B
Shut down Server A
Decrease the number of replicas for the indices

This only works if the number of shard copies (the primary plus its replicas) equals the number of nodes, so that Server B ends up holding a full copy of every index.
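
The replica changes in the steps above go through the index settings API. Here is a small sketch that builds that request in Python; `replica_settings_request` is a made-up helper name and the host is a placeholder.

```python
import json
import urllib.request


def replica_settings_request(host, replicas, index="_all"):
    """Build (but do not send) a PUT <index>/_settings request that changes
    number_of_replicas. Hypothetical helper; defaults to all indices."""
    body = {"index": {"number_of_replicas": replicas}}
    return urllib.request.Request(
        f"{host}/{index}/_settings",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Two-node example: one replica before shutting down Server A, zero afterwards.
raise_replicas = replica_settings_request("http://localhost:9200", 1)
lower_replicas = replica_settings_request("http://localhost:9200", 0)
```

Before shutting down Server A it is worth waiting until the cluster reports green (for example via `GET _cluster/health?wait_for_status=green`), so that all replicas have actually been allocated on Server B.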

Question 8

If anyone runs into the same issue when trying to dump from Elasticsearch <2.0 to >2.0, you need to do:

elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=analyzer
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=mapping
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=data --transform "delete doc.__source['_id']"

Question 9

I have always had success simply copying the index directory/folder over to the new server and restarting it. You'll find the index id by doing GET /_cat/indices, and the folder matching this id is in data\nodes\0\indices (usually inside your elasticsearch folder unless you moved it).
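
To work out which folder belongs to which index, something like the following could help. It assumes a version where `GET /_cat/indices?format=json&h=index,uuid` returns JSON rows with `index` and `uuid` fields and where index folders are named after the uuid; the helper name `index_directories` and the sample row are invented for illustration.

```python
import json
from pathlib import PurePosixPath


def index_directories(cat_indices_json, data_root):
    """Map each index name to the folder that should hold its data,
    given the JSON output of _cat/indices. Hypothetical helper."""
    rows = json.loads(cat_indices_json)
    indices_dir = PurePosixPath(data_root) / "nodes" / "0" / "indices"
    return {row["index"]: indices_dir / row["uuid"] for row in rows}


# Made-up sample row, not real output from any cluster.
sample = '[{"index": "my_index", "uuid": "abc123"}]'
for name, folder in index_directories(sample, "/var/lib/elasticsearch").items():
    print(name, folder)
```

Copying folders like this is only likely to work between nodes running compatible versions, and with Elasticsearch stopped on both sides while copying.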

Question 10

We can use elasticdump or multielasticdump to take a backup and restore it. This lets us move data from one server/cluster to another server/cluster.

Please find the detailed answer I have provided here.

Question 11

If you simply need to transfer data from one Elasticsearch server to another, you could also use elasticsearch-document-transfer.

Steps:

Open a directory in your terminal and run
$ npm install elasticsearch-document-transfer
Create a file config.js
Add the connection details of both Elasticsearch servers in config.js
Set appropriate values in options.js
Run in the terminal
$ node index.js

Question 12

You can take a snapshot of the complete state of your cluster (including all data indices) and restore it (using the restore API) on the new cluster or server.
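
For the restore side, a sketch of the request on the new cluster might look like this. It assumes the snapshot repository has already been registered there; `restore_request` is a made-up helper name, and the repository and snapshot names are placeholders. `include_global_state` and `indices` are options of the restore API.

```python
import json
import urllib.request


def restore_request(host, repo, snapshot, indices=None, include_global_state=True):
    """Build (but do not send) a POST _snapshot/<repo>/<snapshot>/_restore request.
    Hypothetical helper; restores everything in the snapshot unless indices is given."""
    body = {"include_global_state": include_global_state}
    if indices:
        # The API takes a comma-separated list of index names.
        body["indices"] = ",".join(indices)
    return urllib.request.Request(
        f"{host}/_snapshot/{repo}/{snapshot}/_restore",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; send with urllib.request.urlopen(req) on the new cluster.
req = restore_request("http://localhost:9200", "my_backup", "snapshot_1")
```

Note that an index that is already open on the target cluster under the same name has to be closed or deleted before it can be restored.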