  1. HPCC
  2. HPCC-18064

ESP leaving many sockets in CLOSE_WAIT state

    Details

    • Type: Bug
    • Status: Unresourced
    • Priority: Not specified
    • Resolution: Unresolved
    • Affects Version/s: 6.2.14
    • Fix Version/s: None
    • Component/s: Core Libraries, ESP
    • Labels:
      None
    • Environment:
      CentOS 6.4

      Description

      Build = 6.2.14-1
      OS = CentOS 6.4

      Attached is the email from the previous time I resolved the same problem, to show that this is now a recurring problem.

      The problem is that outbound connections from the ESP to dafilesrv are exhausting the ephemeral port range on the ESP host. The ESP is holding about 28k connections in the CLOSE_WAIT state. I tried setting net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse individually, but neither helped clean up the CLOSE_WAIT connections (both settings apply to TIME_WAIT sockets; a socket in CLOSE_WAIT stays there until the owning process closes it).

      netstat -plan | grep 10.195.52
      tcp 1 0 10.241.70.54:47365 10.195.52.18:7100 CLOSE_WAIT 58193/esp
      tcp 1 0 10.241.70.54:43741 10.195.52.18:7100 CLOSE_WAIT 58193/esp
      tcp 1 0 10.241.70.54:55229 10.195.52.18:7100 CLOSE_WAIT 58193/esp
      tcp 1 0 10.241.70.54:41092 10.195.52.18:7100 CLOSE_WAIT 58193/esp
      tcp 1 0 10.241.70.54:39022 10.195.52.18:7100 CLOSE_WAIT 58193/esp
      tcp 1 0 10.241.70.54:55004 10.195.52.18:7100 CLOSE_WAIT 58193/esp
      tcp 1 0 10.241.70.54:54930 10.195.52.18:7100 CLOSE_WAIT 58193/esp

      netstat -plan | grep -c CLOSE_WAIT
      28226
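
      To see which remote endpoint and process account for the leaked sockets, the netstat output above can be grouped rather than just counted. The following is a minimal sketch, not part of the original report: the function name `count_close_wait` is hypothetical, it assumes the seven-column `netstat -plan` layout shown above, and the ESTABLISHED line in the sample is invented purely to show that other states are ignored.

```python
from collections import Counter


def count_close_wait(netstat_output: str) -> Counter:
    """Count CLOSE_WAIT sockets per (remote endpoint, pid/program) pair."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local remote state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "CLOSE_WAIT":
            continue
        counts[(fields[4], fields[6])] += 1
    return counts


# The first seven lines are copied from the report; the last line is a
# hypothetical non-CLOSE_WAIT entry added only to exercise the filter.
SAMPLE = """\
tcp 1 0 10.241.70.54:47365 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 1 0 10.241.70.54:43741 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 1 0 10.241.70.54:55229 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 1 0 10.241.70.54:41092 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 1 0 10.241.70.54:39022 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 1 0 10.241.70.54:55004 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 1 0 10.241.70.54:54930 10.195.52.18:7100 CLOSE_WAIT 58193/esp
tcp 0 0 10.241.70.54:8010 10.241.70.99:51514 ESTABLISHED 58193/esp
"""

if __name__ == "__main__":
    for (remote, owner), n in count_close_wait(SAMPLE).most_common():
        print(f"{n:6d}  {remote}  {owner}")
```

      On a live host the same function could be fed the text captured from `netstat -plan`; a single dominant (remote, process) pair, as in the sample, points at one code path that is not closing its sockets.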

      cat /proc/sys/net/ipv4/ip_local_port_range
      32768 61000
      Total ephemerals = 28232
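
      The arithmetic behind the figures above can be checked with a short sketch. It is illustrative only: the function name `ephemeral_usage` is hypothetical, and it follows the report in taking the total as the difference between the two bounds (counting both bounds as usable would give one more port).

```python
def ephemeral_usage(port_range: str, sockets_in_use: int):
    """Return (total ports, ports left, fraction used) for a range string
    in the format of /proc/sys/net/ipv4/ip_local_port_range."""
    low, high = (int(x) for x in port_range.split())
    # Same convention as the report: total = high - low.
    total = high - low
    return total, total - sockets_in_use, sockets_in_use / total


if __name__ == "__main__":
    # Values copied from the report: range "32768 61000", 28226 CLOSE_WAIT sockets.
    total, left, used = ephemeral_usage("32768 61000", 28226)
    print(f"total={total} left={left} used={used:.2%}")
```

      With the numbers from the report this gives a total of 28232 ports with only 6 left, which is consistent with new outbound connections to dafilesrv failing.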

      I restarted the ESP, which reclaimed all of the used ephemeral ports. I’m not sure whether this is a bug in the HPCC build or a result of how users are hitting the ESP (perhaps with a script).

      You should be able to connect to that LZ again without issue, until the next time you exhaust those client-side ports.

      Server IP: 10.195.52.18
      We’re having an issue trying to access the DEV landing zone for 10.241.70.54:8010. Here’s the error:

            People

            • Assignee:
              mckellyln Mark Kelly
            • Reporter:
              mckellyln Mark Kelly