HPCC-16205

Dali runs out of memory

    Details

    • Compatibility:
      Minor

      Description

      Back on the 13th (2016-08-13), we ran out of threads here and upped the thread count. Counts of pthread_create errors per daily log:

      [fernanux@bair_prod_thor:bair-analytics-prod-thor-dali server]$ grep -c "ERROR: pthread_create" DaServer.2016_08_*.log
      DaServer.2016_08_01.log:0
      DaServer.2016_08_02.log:0
      DaServer.2016_08_03.log:0
      DaServer.2016_08_04.log:0
      DaServer.2016_08_05.log:0
      DaServer.2016_08_06.log:0
      DaServer.2016_08_07.log:0
      DaServer.2016_08_08.log:0
      DaServer.2016_08_09.log:0
      DaServer.2016_08_10.log:0
      DaServer.2016_08_11.log:0
      DaServer.2016_08_12.log:0
      DaServer.2016_08_13.log:81
      DaServer.2016_08_14.log:0

      So we upped the thread count.

      [fernanux@bair_prod_thor:bair-analytics-prod-thor-dali server]$ cat /proc/sys/kernel/pid_max
      2097152
      ✔ complete at 14:27:57
      [fernanux@bair_prod_thor:bair-analytics-prod-thor-dali server]$
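pid_max is only one of the kernel settings that can cap thread creation; threads-max and vm.max_map_count are system-wide limits as well, and the per-user process limit (ulimit -u) also applies. Which of these was actually raised on this box is not stated above. A minimal sketch for dumping the system-wide values in one go, assuming a Linux /proc layout (the function name and the proc_root parameter are made up for illustration):

```python
from pathlib import Path

# System-wide kernel settings that can limit how many threads can be created.
# Paths are relative to the proc root so the reader can be pointed elsewhere.
LIMIT_FILES = {
    "pid_max": "sys/kernel/pid_max",
    "threads_max": "sys/kernel/threads-max",
    "max_map_count": "sys/vm/max_map_count",
}

def read_limits(proc_root="/proc"):
    """Return {name: int or None}; None means the file was missing or unreadable."""
    limits = {}
    for name, rel in LIMIT_FILES.items():
        path = Path(proc_root) / rel
        try:
            limits[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits

if __name__ == "__main__":
    for name, value in read_limits().items():
        print(name, value)
```

Recording these next to the pthread_create error counts would show which limit, if any, was close to being hit.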

      erver.2016_08_13.log:0027BDA3 2016-08-13 15:39:19.524 8605 8610 "SYS: PU= 18% MU= 99% MAL=14705985040 MMP=204746752 SBK=14501238288 TOT=15020788K RAM=15898564K SWP=0K"
      DaServer.2016_08_13.log:0027BDB6 2016-08-13 15:40:21.370 8605 8610 "SYS: PU= 21% MU= 98% MAL=14689198512 MMP=204746752 SBK=14484451760 TOT=15004404K RAM=15886344K SWP=0K"
      DaServer.2016_08_13.log:0027C165 2016-08-13 15:41:21.464 8605 8610 "SYS: PU= 4% MU= 98% MAL=14689194112 MMP=204746752 SBK=14484447360 TOT=15004404K RAM=15882668K SWP=0K"
      DaServer.2016_08_13.log:00000054 2016-08-13 15:45:17.101 2906 2911 "SYS: PU= 17% MU= 12% MAL=1395987184 MMP=130035712 SBK=1265951472 TOT=1366860K RAM=1986856K SWP=0K"
      DaServer.2016_08_13.log:00000075 2016-08-13 15:46:17.102 2906 2911 "SYS: PU= 0% MU= 12% MAL=1396008944 MMP=130035712 SBK=1265973232 TOT=1366884K RAM=1989152K SWP=0K"
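In the SYS lines above, MAL equals MMP + SBK in every sample (presumably mmap'd plus sbrk'd heap), and the PID column changes from 8605 to 2906 between 15:41 and 15:45, which is where Dali was restarted. A rough sketch for pulling those fields out of the logs and flagging restarts, assuming the line layout shown above; the function names are made up for illustration and the two sample lines are copied from the excerpt:

```python
import re

# Timestamp, PID, TID, then the quoted SYS payload, as in the excerpt above.
SYS_RE = re.compile(
    r'(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) '
    r'(?P<pid>\d+) (?P<tid>\d+) "SYS: (?P<body>[^"]*)"'
)
# NAME= value pairs; trailing units such as % or K are ignored.
FIELD_RE = re.compile(r'(\w+)=\s*(\d+)')

def parse_sys_line(line):
    """Return a dict for one SYS line, or None if the line does not match."""
    m = SYS_RE.search(line)
    if m is None:
        return None
    record = {"stamp": m.group("stamp"), "pid": int(m.group("pid"))}
    for name, value in FIELD_RE.findall(m.group("body")):
        record[name] = int(value)
    return record

def find_restarts(records):
    """Return consecutive record pairs whose PID differs (a process restart)."""
    return [(a, b) for a, b in zip(records, records[1:]) if a["pid"] != b["pid"]]

SAMPLE = [
    'DaServer.2016_08_13.log:0027C165 2016-08-13 15:41:21.464 8605 8610 "SYS: PU= 4% MU= 98% MAL=14689194112 MMP=204746752 SBK=14484447360 TOT=15004404K RAM=15882668K SWP=0K"',
    'DaServer.2016_08_13.log:00000054 2016-08-13 15:45:17.101 2906 2911 "SYS: PU= 17% MU= 12% MAL=1395987184 MMP=130035712 SBK=1265951472 TOT=1366860K RAM=1986856K SWP=0K"',
]

records = [r for r in map(parse_sys_line, SAMPLE) if r is not None]
for r in records:
    # Last column checks the MAL = MMP + SBK relationship for this sample.
    print(r["stamp"], r["pid"], r["MU"], r["MAL"] == r["MMP"] + r["SBK"])
for before, after in find_restarts(records):
    print("restart between", before["stamp"], "and", after["stamp"])
```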

      Ran out of memory today (2016-09-01):

      0557 2016-09-01 13:07:31.468 2906 2911 "SYS: PU= 28% MU= 99% MAL=14883419648 MMP=197144576 SBK=14686275072 TOT=15239152K RAM=15886992K SWP=0K"
      0023055C 2016-09-01 13:08:31.948 2906 2911 "SYS: PU= 33% MU= 99% MAL=14900206848 MMP=197144576 SBK=14703062272 TOT=15255536K RAM=15898732K SWP=0K"
      0000000D 2016-09-01 13:09:47.894 223022 223027 "SYS: PU= 24% MU= 13% MAL=1537901568 MMP=130035712 SBK=1407865856 TOT=1503472K RAM=2114664K SWP=0K"
      000000B5 2016-09-01 13:10:47.895 223022 223027 "SYS: PU= 2% MU= 13% MAL=1538166384 MMP=130035712 SBK=1408130672 TOT=1505668K RAM=2124532K SWP=0K"
      000000B9 2016-09-01 13:11:47.896 223022 223027 "SYS: PU= 0% MU= 13% MAL=1538168320 MMP=130035712 SBK=1408132608 TOT=1505668K RAM=2125888K SWP=0K"
      000000F0 2016-09-01 13:12:47.897 223022 223027 "SYS: PU= 1% MU= 13% MAL=1538313776 MMP=130035712 SBK=1408278064 TOT=1528340K RAM=2142312K SWP=0K"
      0000010D 2016-09-01 13:13:47.898 223022 223027 "SYS: PU= 1% MU= 13% MAL=1539195248 MMP=130035712 SBK=1409159536 TOT=1529664K RAM=2143880K SWP=0K"
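PID 2906 shows up both right after the 2016-08-13 restart and in the last samples before the 2016-09-01 restart, so it looks like the same process grew from about 1.4 GB to about 14.9 GB of MAL over that stretch. A back-of-the-envelope sketch of the average growth rate, assuming the process ran uninterrupted between those two samples and that growth was roughly linear (neither is confirmed by the logs shown); growth_per_day is a made-up helper name:

```python
from datetime import datetime

def growth_per_day(t0, mal0, t1, mal1):
    """Average MAL growth in bytes per day between two SYS samples."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    elapsed = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    days = elapsed.total_seconds() / 86400
    return (mal1 - mal0) / days

# First sample: shortly after the 2016-08-13 restart (PID 2906).
# Second sample: last line before the 2016-09-01 restart (PID 2906).
rate = growth_per_day(
    "2016-08-13 15:46:17.102", 1396008944,
    "2016-09-01 13:08:31.948", 14900206848,
)
print(f"{rate / 2**20:.0f} MiB/day")
```

Running the same calculation on later log pairs would show whether the rate is steady or tied to particular workloads.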

      [fernanux@bair_prod_thor:bair-analytics-prod-thor-dali ~]$ sudo dpkg -l|grep hpcc
      ii hpccsystems-platform 5.2.4-1 amd64 hpccsystems-platform-internal
      ✔ complete at 15:16:16
      [fernanux@bair_prod_thor:bair-analytics-prod-thor-dali ~]

            People

            • Assignee:
              jakesmith Jake Smith
              Reporter:
              fuceta Fernando Uceta