Concurrent integration posts

  • Concurrent integration posts

    Hi all and ex-Glog friends,
    I am looking for feedback and input on concurrent integration posting volume (WMServlet).

    The issue: integration posts like a dream, then suddenly starts posting very slowly, as in only a few posts per minute, or even a few minutes between posts.
    *Posting via BizTalk

    I found that a remote desktop application on the app/web server (vino-server, the GNOME VNC server) was eating 99.9% CPU.
    When integration slowed to a crawl, vino-server showed up in top for the first time, at the top of the list with 99.9% CPU.
    *Direct correlation between integration slowing and vino-server taking 99.9% CPU

    Killing vino-server restored idle CPU, but integration never picked back up.
    BizTalk still posted very slowly, as described above, so I am just not clear on what is going on here.
    I was wondering if maybe there were too many hung connections waiting to time out, but wasn't sure.

    I tried to make sense of the netstat output, but again wasn't clear on exactly what I was looking at.

    I checked Google and this is a known issue with vino-server taking 100% CPU; it's a bug.

    I am fairly convinced that the vino-server app is the cause of the issue, but no one seems to want to believe me.
    Thus, I am looking to the big guns for some input.

    Quick specs on Test system being used:
    OTM 5.5 CU02 #4
    App and Web are on the same box
    *** App is OAS
    OS = Red Hat Linux
    4 CPUs, 4 GB RAM

    DB: 10g 10.2.0.3
    Win 2003
    4 CPUs, 4 GB RAM
    SAN storage

    Not high-end specs.. but a Test instance.

    Assuming for a second nothing is wrong with this instance and everything is working ok,
    can someone please provide input on the following:

    My first question is about concurrent posts to WMServlet.
    What actually controls this limit?
    I see the Tomcat config shows maxThreads 500.

    *Client uses BizTalk to HTTP POST, and BizTalk can grow to hundreds of concurrent threads.

    I found the following:
    1: Not running out of CPU on web/app
    2: memory is not swapping at all per vmstat
    3: both JVMs appear to be OK as far as memory in the logs
    Tomcat: INFO | jvm 1 | 2008/01/14 13:00:10 | 621345K->414715K(1023424K), 0.1054570 secs]
    OAS: [GC 647391K->433253K(1023424K), 0.0487000 secs]
    4: No errors in any logs including Apache access logs
    5: No backlog appearing in Event Diag
    At the highest volume:
    OAS JVM was using 1.6 GB ram (per TOP)
    and Tomcat JVM was using 900 MB ram (per TOP)
    *Again.. zero swapping was taking place and both JVM logs showed ok memory as noted in #3 above.

    If system resources looked ok, can you pump 100's of concurrent posts without causing a problem ?
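
One way to explore the concurrency question without leaning on BizTalk's own scaling is to cap the number of in-flight posts on the client side and raise the cap step by step. The following is only a minimal sketch of that idea: `fake_post`, `post_all` and `InFlightTracker` are hypothetical names, and the post itself is simulated with a short sleep rather than a real HTTP call to WMServlet.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InFlightTracker:
    """Records how many simulated posts are in flight at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self.current -= 1
        return False


def fake_post(payload: str, tracker: InFlightTracker) -> int:
    """Stand-in for an HTTP POST (assumption: sleeps instead of calling a server)."""
    with tracker:
        time.sleep(0.01)
    return 200


def post_all(payloads, max_concurrent: int, tracker: InFlightTracker):
    """Send every payload while never exceeding max_concurrent posts in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(lambda p: fake_post(p, tracker), payloads))


if __name__ == "__main__":
    tracker = InFlightTracker()
    statuses = post_all([f"location-{i}" for i in range(40)], 8, tracker)
    print(len(statuses), "posts sent, peak in flight:", tracker.peak)
```

Running the same batch at a few different caps and comparing the elapsed time would show where extra client threads stop helping.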

    For example, the client must upload 100,000 location UPDATES into OTM.
    For various reasons, this cannot be done via .CSV or DirLoadServlet, since the location loads are all updates to existing locations. (IU)

    My 2 goals here are:
    1: How many concurrent threads are realistic to be posting location updates to WMServlet (or any inbound integration to OTM)
    2: Other than the vino-server issue, what else would/could cause integration to just slow to a crawl for no obvious reason ?

    Lastly, any idea what type of expectation there should be on the time to load 100K location updates via WMServlet ?
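
For a rough feel of the numbers, the load time can be bounded with simple arithmetic: if each thread completes one post every so many seconds, throughput is threads divided by seconds per post. The sketch below assumes ideal scaling with no server-side contention, and the thread counts and per-post latencies are made-up inputs, not measurements from OTM.

```python
def estimated_load_seconds(total_posts: int, concurrent_threads: int,
                           seconds_per_post: float) -> float:
    """Ideal-case duration, assuming throughput scales linearly with threads."""
    if concurrent_threads < 1 or seconds_per_post <= 0:
        raise ValueError("need at least one thread and a positive latency")
    posts_per_second = concurrent_threads / seconds_per_post
    return total_posts / posts_per_second


if __name__ == "__main__":
    TOTAL_UPDATES = 100_000  # figure from the post above
    # Hypothetical thread counts and latencies, for illustration only.
    for threads in (1, 10, 50):
        for latency in (0.5, 2.0):
            hours = estimated_load_seconds(TOTAL_UPDATES, threads, latency) / 3600
            print(f"{threads:3d} threads, {latency:.1f}s per post: {hours:6.2f} h")
```

Real throughput will flatten out well before the ideal line once the app server or database becomes the bottleneck, so this is a lower bound on duration rather than a forecast.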

    Thanks for reading this.

    I greatly appreciate any suggestions or input.

    Best Regards,
    Brian R
    Last edited by BigB; January 18, 2008, 19:38.

  • #2
    Re: Concurrent integration posts

    Brian,

    I agree that Vino is related to the issue, but not sure whether or not it's the direct cause. If killing Vino doesn't resolve the issue, then I suspect your issue is multi-rooted. Initially I would suspect either:
    1. Sun Java JDK performance limitations
    2. VM Swapping
    3. Running out of TCP connections (test via netstat and /var/log/messages)
    4. Running out of Execute Threads (WebLogic only)
    (Note: I included #4 because it's often an issue on a WebLogic server, since this thread group is used by all incoming web connections from Tomcat. I'm sure OAS has a similar thread group, but don't yet know how to adjust it.)
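
For item 3, the raw netstat listing is easier to read once it is boiled down to a count per TCP state. This is a minimal sketch that assumes the usual Linux `netstat -ant` column layout (state in the sixth column); the sample text is hypothetical and was not captured from the system in this thread.

```python
from collections import Counter


def tally_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from `netstat -ant` text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp  recv-q  send-q  local  remote  STATE
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:80             10.0.0.9:51000          ESTABLISHED
tcp        0      0 10.0.0.5:80             10.0.0.9:51001          TIME_WAIT
tcp        0      0 10.0.0.5:80             10.0.0.9:51002          TIME_WAIT
tcp        0      0 10.0.0.5:80             10.0.0.9:51003          CLOSE_WAIT
"""

if __name__ == "__main__":
    for state, count in tally_tcp_states(SAMPLE).most_common():
        print(f"{state:12} {count}")
```

Taking this tally once while integration is healthy and again during a slowdown gives something concrete to compare; a pile-up in TIME_WAIT or CLOSE_WAIT would be worth a closer look.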

    I'm not aware of a hard-coded limit with WMServlet, but suspect that the Java JDK may be at the root of your issues. OAS uses the Sun JDK which has numerous performance limitations, in comparison to JRockit, which WebLogic uses.

    In addition, can you verify again whether the system is running out of physical memory and swapping (use iostat and vmstat)? With only 4GB for the web/app server, I suspect this may be an issue. If your OTM Java heaps for Tomcat and OAS are 1GB to 1.5GB, I'd recommend at least 6GB on the server to handle Apache, the 2 Java heaps, additional Java runtime memory and the OS. I know you stated that this was checked, but I'd verify again, as this is often an issue on systems with so little memory.

    --Chris
    Chris Plough
    twitter.com/chrisplough
    MavenWire



    • #3
      Re: Concurrent integration posts

      Hey Chris,
      Thanks for the reply.

      Below is a snapshot of top from last week.
      Free memory seemed very low.. but nothing was swapping.
      (vmstat showed 0 pages swapped in and out at all times also.)

      *Does this look ok to you or do you see issues ?

      I tried to explain that their HW specs are low, especially RAM..
      I am still fighting that battle..

      1 last question:
      If the issue with integration slowing to a crawl happens again:

      I am not ultra-savvy on netstat, so I will see about having the client's IT get involved with some analysis.
      (I know little about netstat. I really only have experience using it under Windows to look for active connections when I fix hacked or infected boxes for people.)

      Other than the stuff I somewhat know to check (top, vmstat, sar, OAS log, tomcat log, apache access log)
      What else, if anything, would you suggest checking to try to nail down the issue ?

      Thanks again.
      I appreciate all your suggestions and time !

      Regards,
      Brian

      Tasks: 117 total, 2 running, 115 sleeping, 0 stopped, 0 zombie
      Cpu(s): 31.2% us, 0.3% sy, 0.0% ni, 68.4% id, 0.0% wa, 0.0% hi, 0.0% si
      Mem: 4149156k total, 4120784k used, 28372k free, 126728k buffers
      Swap: 4194296k total, 160k used, 4194136k free, 1236712k cached

        PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
      15401 otmqa    21   0 1737m 974m  58m S 43.3 24.0  52:05.11 java
      15266 otmqa    16   0 1716m 1.1g  21m S 18.6 28.8 175:38.22 java
      18507 root     16   0  463m 453m 7640 S  0.7 11.2 831:43.59 gnome-system-mo
       2988 root     15   0 15964  11m 3172 S  0.3  0.3 451:53.34 Xvnc
      21866 root     17   0  3552  964  744 S  0.3  0.0   0:18.97 top
          1 root     16   0  3368  560  480 S  0.0  0.0   0:01.20 init
          2 root     RT   0     0    0    0 S  0.0  0.0   0:00.66 migration/0
          3 root     34  19     0    0    0 S  0.0  0.0   1:54.83 ksoftirqd/0
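
A note on reading the summary lines in this snapshot: this style of top counts buffers and page cache inside "used", so a very low "free" figure does not by itself mean the box is out of memory. A minimal sketch of the arithmetic, using the Mem and Swap lines exactly as posted (`parse_top_memory` is a hypothetical helper name):

```python
import re


def parse_top_memory(top_text: str) -> dict:
    """Extract the kB figures from top's 'Mem:' and 'Swap:' summary lines."""
    values = {}
    for line in top_text.splitlines():
        line = line.strip()
        if line.startswith(("Mem:", "Swap:")):
            prefix = line.split(":", 1)[0].lower()
            for amount, label in re.findall(r"(\d+)k\s+(\w+)", line):
                # 'cached' is printed on the Swap line but describes RAM page cache.
                key = label if label in ("buffers", "cached") else f"{prefix}_{label}"
                values[key] = int(amount)
    return values


# Summary lines copied from the snapshot in this post.
TOP_SUMMARY = """\
Mem: 4149156k total, 4120784k used, 28372k free, 126728k buffers
Swap: 4194296k total, 160k used, 4194136k free, 1236712k cached
"""

if __name__ == "__main__":
    m = parse_top_memory(TOP_SUMMARY)
    reclaimable = m["mem_free"] + m["buffers"] + m["cached"]
    in_use = m["mem_used"] - m["buffers"] - m["cached"]
    print(f"free + buffers + cached:  {reclaimable} kB")
    print(f"used minus buffers/cache: {in_use} kB")
```

With these figures, free plus buffers plus cached comes to 1391812 kB (about 1.3 GB), and used minus buffers and cache comes to 2757344 kB, which is consistent with vmstat showing no swapping.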



      • #4
        Re: Concurrent integration posts

        Brian,

        One more thing to check. Run ifconfig and see if you have any errors on the network interface. Also have them check the switch for errors. It's a long shot, but you never know.


        Nick
        If my post was helpful please click on the Thanks! button

        MavenWire Hosting Admin
        15 years of OTM experience



        • #5
          Re: Concurrent integration posts

          Hey Nick,
          ifconfig is new to me. Here is the output.

          Anything you see here that looks suspect ?

          Thanks a lot !
          -Brian

          eth0      Link encap:Ethernet  HWaddr 00:18:8B:48:65:62
                    inet addr:10.24.10.33  Bcast:10.24.10.255  Mask:255.255.255.0
                    inet6 addr: fe80::218:8bff:fe48:6562/64 Scope:Link
                    UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                    RX packets:641631180 errors:0 dropped:0 overruns:0 frame:0
                    TX packets:870044551 errors:0 dropped:0 overruns:0 carrier:0
                    collisions:0 txqueuelen:1000
                    RX bytes:3783261113 (3.5 GiB)  TX bytes:475892536 (453.8 MiB)
                    Interrupt:169 Memory:f4000000-f4011100

          lo        Link encap:Local Loopback
                    inet addr:127.0.0.1  Mask:255.0.0.0
                    inet6 addr: ::1/128 Scope:Host
                    UP LOOPBACK RUNNING  MTU:1500  Metric:1
                    RX packets:113748100 errors:0 dropped:0 overruns:0 frame:0
                    TX packets:113748100 errors:0 dropped:0 overruns:0 carrier:0
                    collisions:0 txqueuelen:0
                    RX bytes:812219419 (774.5 MiB)  TX bytes:812219419 (774.5 MiB)
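
The fields that matter in this output are the counters after `RX packets` and `TX packets` (errors, dropped, overruns, frame, carrier) plus `collisions`. For repeated checks, a minimal sketch like the one below could flag any of them that are non-zero. It assumes the classic net-tools ifconfig layout shown above; `nonzero_error_counters` is a hypothetical helper name.

```python
import re


def nonzero_error_counters(ifconfig_output: str) -> dict:
    """Return {(interface, direction, counter): value} for every non-zero counter."""
    problems = {}
    iface = None
    for line in ifconfig_output.splitlines():
        if "Link encap:" in line:
            iface = line.split()[0]  # interface name opens the block
            continue
        packets = re.search(r"\b(RX|TX) packets:\d+(.*)", line)
        if packets:
            direction, rest = packets.groups()
            for name, value in re.findall(r"(\w+):(\d+)", rest):
                if int(value) > 0:
                    problems[(iface, direction, name)] = int(value)
        collisions = re.search(r"\bcollisions:(\d+)", line)
        if collisions and int(collisions.group(1)) > 0:
            problems[(iface, "TX", "collisions")] = int(collisions.group(1))
    return problems


# eth0 block as posted in this thread.
ETH0_FROM_THREAD = """\
eth0 Link encap:Ethernet HWaddr 00:18:8B:48:65:62
RX packets:641631180 errors:0 dropped:0 overruns:0 frame:0
TX packets:870044551 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3783261113 (3.5 GiB) TX bytes:475892536 (453.8 MiB)
"""

if __name__ == "__main__":
    found = nonzero_error_counters(ETH0_FROM_THREAD)
    print(found if found else "no interface errors reported")
```

Because these counters are cumulative since boot, comparing two runs taken a few minutes apart says more than a single reading.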



          • #6
            Re: Concurrent integration posts

            This looks good... you have no errors or collisions on the server. Just have them also check the switch to see if it is generating any errors.
            If my post was helpful please click on the Thanks! button

            MavenWire Hosting Admin
            15 years of OTM experience



            • #7
              Re: Concurrent integration posts

              Hey Nick,
              I will request the switch check from the client.

              Thanks again !!
              -Brian
