Tại sao OOM-killer được gọi ngay cả khi dung lượng RAM lớn là miễn phí


8

Trên phiên bản xlarge ub Ubuntu được hỗ trợ bởi ebs EC2, oom-killer đang được gọi. Từ đầu ra / var / log / syslog bên dưới, có vẻ như ZONE_NORMAL sắp hết bộ nhớ:

Node 0 Normal free:11344kB min:11556kB low:14444kB high:17332kB active_anon:10936284kB inactive_anon:144kB active_file:688kB inactive_file:740kB

nhưng tại sao ZONE_NORMAL chỉ được phân bổ 11 MB trong tổng số 15GB RAM? Hoặc có một số lý do khác cho việc hết bộ nhớ?

RAM có sẵn trên máy (ví dụ xlarge) là 15GB. Tổng cột rss trong nhật ký bên dưới là 3,7 GB và tổng số cột là 11,4 GB.

Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456146] node invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0, oom_score_adj=0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456153] node cpuset=/ mems_allowed=0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456157] Pid: 639, comm: node Not tainted 2.6.38-1-virtual #28-Ubuntu
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456160] Call Trace:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456170]  [<ffffffff810b259d>] ? cpuset_print_task_mems_allowed+0x9d/0xb0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456189]  [<ffffffff81106a01>] ? dump_header+0x91/0x1e0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456196]  [<ffffffff81007e7f>] ? xen_restore_fl_direct_end+0x0/0x1
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456203]  [<ffffffff815c92de>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456208]  [<ffffffff8110705d>] ? oom_kill_process+0x8d/0x180
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456213]  [<ffffffff81107492>] ? out_of_memory+0x102/0x240
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456218]  [<ffffffff8110c209>] ? __alloc_pages_nodemask+0x709/0x7f0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456224]  [<ffffffff81105c78>] ? filemap_fault+0x398/0x490
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456230]  [<ffffffff811414bd>] ? alloc_pages_current+0x9d/0x110
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456235]  [<ffffffff8104338b>] ? pte_alloc_one+0x1b/0x50
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456239]  [<ffffffff8100530e>] ? xen_pud_val+0xe/0x10
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456245]  [<ffffffff811255c5>] ? __pte_alloc+0x35/0x100
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456250]  [<ffffffff81129559>] ? handle_mm_fault+0x129/0x250
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456257]  [<ffffffff815ccf82>] ? do_page_fault+0x1a2/0x540
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456262]  [<ffffffff8109aacb>] ? do_futex+0xbb/0x210
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456266]  [<ffffffff8109ac9b>] ? sys_futex+0x7b/0x180
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456274]  [<ffffffff814c2975>] ? sys_recvmsg+0x75/0x90
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456278]  [<ffffffff815c9a15>] ? page_fault+0x25/0x30
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456282] Mem-Info:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456284] Node 0 DMA per-cpu:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456288] CPU    0: hi:    0, btch:   1 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456290] CPU    1: hi:    0, btch:   1 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456293] CPU    2: hi:    0, btch:   1 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456295] CPU    3: hi:    0, btch:   1 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456299] Node 0 DMA32 per-cpu:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456303] CPU    0: hi:  186, btch:  31 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456307] CPU    1: hi:  186, btch:  31 usd:  81
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456310] CPU    2: hi:  186, btch:  31 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456314] CPU    3: hi:  186, btch:  31 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456317] Node 0 Normal per-cpu:
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456321] CPU    0: hi:  186, btch:  31 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456323] CPU    1: hi:  186, btch:  31 usd: 175
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456326] CPU    2: hi:  186, btch:  31 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456329] CPU    3: hi:  186, btch:  31 usd:   0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456334] active_anon:3726183 inactive_anon:44 isolated_anon:0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456335]  active_file:214 inactive_file:237 isolated_file:64
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456336]  unevictable:0 dirty:1 writeback:0 unstable:0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456337]  free:18955 slab_reclaimable:2811 slab_unreclaimable:7277
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456338]  mapped:256 shmem:101 pagetables:12634 bounce:0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456341] Node 0 DMA free:15880kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(
anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456353] lowmem_reserve[]: 0 4024 15142 15142
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456360] Node 0 DMA32 free:48596kB min:4180kB low:5224kB high:6268kB active_anon:3968448kB inactive_anon:32kB active_file:168kB inactive_file:208kB unevic
table:0kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:0kB mapped:328kB shmem:80kB slab_reclaimable:2208kB slab_unreclaimable:8604kB kernel_stack:48kB pagetabl
es:9724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:601 all_unreclaimable? yes
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456371] lowmem_reserve[]: 0 0 11117 11117
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456377] Node 0 Normal free:11344kB min:11556kB low:14444kB high:17332kB active_anon:10936284kB inactive_anon:144kB active_file:688kB inactive_file:740kB
unevictable:0kB isolated(anon):0kB isolated(file):256kB present:11384720kB mlocked:0kB dirty:4kB writeback:0kB mapped:696kB shmem:324kB slab_reclaimable:9036kB slab_unreclaimable:20504kB kernel_stack:20
16kB pagetables:40812kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2190 all_unreclaimable? yes
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456388] lowmem_reserve[]: 0 0 0 0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456393] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456409] Node 0 DMA32: 715*4kB 525*8kB 343*16kB 245*32kB 147*64kB 50*128kB 16*256kB 5*512kB 2*1024kB 0*2048kB 1*4096kB = 48996kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456423] Node 0 Normal: 1919*4kB 5*8kB 2*16kB 2*32kB 1*64kB 1*128kB 2*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 11588kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456438] 768 total pagecache pages
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456440] 0 pages in swap cache
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456443] Swap cache stats: add 0, delete 0, find 0/0
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456445] Free swap  = 0kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456447] Total swap = 0kB
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495285] 3934192 pages RAM
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495290] 91275 pages reserved
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495291] 1661 pages shared
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495293] 3822220 pages non-shared
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495296] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495310] [  228]     0   228     4297       71   2       0             0 upstart-udev-br
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495315] [  242]     0   242     5353      123   2     -17         -1000 udevd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495319] [  319]     0   319     5352      126   2     -17         -1000 udevd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495323] [  320]     0   320     5352      126   0     -17         -1000 udevd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495327] [  392]     0   392     1771      124   2       0             0 dhclient
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495332] [  467]     0   467    12366      140   3     -17         -1000 sshd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495336] [  474]   101   474    47476      160   2       0             0 rsyslogd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495340] [  502]   102   502     5927      108   2       0             0 dbus-daemon
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495344] [  534]     0   534     1035       27   3       0             0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495348] [  539]     0   539     1035       27   0       0             0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495352] [  543]     0   543    13589     2510   1       0             0 supervisord
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495356] [  544]     0   544     1035       25   2       0             0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495360] [  545]     0   545     1035       27   2       0             0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495364] [  547]     0   547     1035       27   2       0             0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495368] [  552]     0   552     5302       65   1       0             0 cron
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495372] [  553]     0   553     4753       44   1       0             0 atd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495376] [  565]     0   565     3946       46   2       0             0 irqbalance
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495380] [  571]   106   571    35760     3859   1       0             0 memcached
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495384] [  637]  1000   637   222128    36427   1       0             0 node
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495388] [  639]  1000   639   189634     4396   1       0             0 node
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495392] [  696]     0   696     1035       27   3       0             0 getty
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495396] [  755]     0   755    31312      372   2       0             0 console-kit-dae
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495401] [ 2404]     0  2404    35058     3644   1       0             0 carbon-cache.py
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495405] [ 2601]  1000  2601     5982      199   1       0             0 screen
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495409] [ 2602]  1000  2602     5576      766   0       0             0 bash
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495413] [ 2661]     0  2661     6801       67   1       0             0 sudo
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495417] [ 2663]     0  2663     7133      851   1       0             0 run-graphite-de
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495421] [ 2668]     0  2668     1056       25   1       0             0 sh
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495425] [ 2669]     0  2669    12906     3942   1       0             0 django-admin.py
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495429] [ 2670]     0  2670    50313    11546   1       0             0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495433] [ 2785]     0  2785    15953      489   1       0             0 nginx
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495437] [ 3123]  1000  3123  1184911   266122   1       0             0 java
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495443] [18135]     0 18135  5884532     9128   1       0             0 mongod
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495449] [12574]  1000 12574   966781   932357   2       0             0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495455] [30139]  1000 30139   759010   741642   1       0             0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495461] [14186]    33 14186    19905     4657   1       0             0 nginx
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495465] [14187]    33 14187    19726     4527   1       0             0 nginx
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495470] [14190]  1000 14190    23767     8459   1       0             0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495474] [14201]  1000 14201  1706261  1687237   1       0             0 python
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495478] [24553]     0 24553    19831      200   1       0             0 sshd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495482] [24584]  1000 24584    19831      200   1       0             0 sshd
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495486] [24585]  1000 24585     6464     1655   1       0             0 bash
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495490] Out of memory: Kill process 14201 (python) score 439 or sacrifice child
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495509] Killed process 14201 (python) total-vm:6825044kB, anon-rss:6748848kB, file-rss:100kB

Cảm ơn bạn đã giúp đỡ!


4
Vì bạn chưa bị tàn tật oom-killer?
HoplessN00b

2
Có lẽ bạn nên bao gồm thông tin như làm thế nào để làm điều đó.
mdpc

pmap -x <pid> hiển thị giá trị rss tính bằng kb. Trong báo cáo sự cố oom, cột rss có số lượng trang. Nhân nó với kích thước trang (4kB).
tarkeshwar

Câu trả lời:


13

Có vẻ như OOM khá hợp pháp đối với tôi:

  1. Bạn có một nút NUMA (cụ thể là Nút 0) có ba vùng:
    1. DMA có 15880kB miễn phí
    2. DMA32 có 48996kB miễn phí
    3. Normal trong đó có 11588kB miễn phí
  2. Hầu như tất cả bộ nhớ của bạn 3968448kB + 10936284kBnằm trong active_anondanh sách LRU.
  3. Kích thước bộ nhớ trong bảng nằm trong các trang (mà trên kiến ​​trúc của bạn rất có thể là 4kb) vì vậy chúng có tổng số lên tới 14,8Gb chứ không phải 3,7.

0

Cả hai đều đúng 3,7Gb (4) = 14,8Gb. Mỗi giá trị rss phải được nhân với kích thước của các trang để có kích thước thực được phân bổ trong bộ nhớ. Trong Bộ mô tả bộ nhớ, rss chỉ là số lượng / * trang được phân bổ * /

Trân trọng

Khi sử dụng trang web của chúng tôi, bạn xác nhận rằng bạn đã đọc và hiểu Chính sách cookieChính sách bảo mật của chúng tôi.
Licensed under cc by-sa 3.0 with attribution required.