Scalability not working as advertised/expected

  • Scalability not working as advertised/expected

    Hello OTM'ers!
    - My customer is testing on 5.5 CU4. Application and Web Servers all running on Sun Solaris 9 (64-bit).

    - Configuration is 2 web server processes (Apache/Tomcat) and 2 Application Server processes (Default and App99).

    - Defined a single DEFAULT cluster, made up of both Application Servers (Default and App99) with equal weights of 1.

    - This is an active-active configuration (Failover box is NOT checked for either App Server).

    - No function based routing, No Domain based routing, etc. Plain vanilla, with expectation that ALL work gets routed to all App Servers (Scalability Topology).

    - Integration is using SOAP WSDL directly to App Server (Default Server), bypassing web servers altogether (NO HTTP POST or AQ being used).

    - Admin panels confirm the configuration and all check marks indicate everything is working as designed.

    - Scalability Guide (June 200 indicates this single default cluster with multiple app servers (and no explicit routing) should result in all work being routed to all App Servers in the cluster.

    - Over 5,000 orders "pumped" (via Integration) in a short period of time.

    - Unfortunately, Default Server is handling ALL activity. None of the work (read: Zero) is being processed by the second App Server (APP99) in the Default Cluster.


    Any insight/assistance would be greatly appreciated.

    Thanks!
    LenDB2

  • #2
    Re: Scalability not working as advertised/expected

    You'll need to check the eventdiagservlet to be sure that the 2nd app server isn't doing anything.

    http://otm/GC3/glog.webserver.event.EventDiagServlet

    MavenWire Hosting Admin
    15 years of OTM experience


    • #3
      Re: Scalability not working as advertised/expected

      We did check the eventdiagservlet during and immediately after our batch, and only saw counters greater than 0 on the first App Server (Default). The one exception was an event named ProcessSweeper (sp?), which had a relatively small count. All of the counters on the Default server (Integration, Outbound Integration, Planning, Publishing, etc.) showed activity that accounted for all of the work processed.


      • #4
        Re: Scalability not working as advertised/expected

        A single bulk plan will only run on one app server; its work is not shared across two app servers while it is being processed. If you sent in, say, 10 bulk plans, those should be spread across both machines.

        One major problem I have found with clients is that some tend not to use DNS names for the servers. If this is the case for you, it could be the problem. Scalability relies in part on the gc3-hostnames. If you entered the DNS name as an IP address such as 10.10.10.10 when you installed the application, both apps would have been assigned gc3-10, which is a bad thing. Let me know if this was the case and I'll tell you how to fix it.
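
To illustrate the collision described above, here is a minimal Python sketch. It assumes the short name is built from "gc3-" plus the first dot-separated label of whatever was entered as the host name. That rule is only inferred from the gc3-10 example in this post, not taken from OTM itself, and `short_name` and the example host names are hypothetical.

```python
def short_name(host: str) -> str:
    """Hypothetical rule: 'gc3-' plus the first dot-separated label of the host."""
    return "gc3-" + host.split(".")[0]


# Two app servers installed with IP addresses instead of DNS names.
ip_hosts = ["10.10.10.10", "10.10.10.11"]
# The same two app servers installed with (made-up) DNS names.
dns_hosts = ["otm-app1.example.com", "otm-app2.example.com"]

ip_names = [short_name(h) for h in ip_hosts]
dns_names = [short_name(h) for h in dns_hosts]

print(ip_names)   # ['gc3-10', 'gc3-10']  -> both servers get the same name
print(dns_names)  # ['gc3-otm-app1', 'gc3-otm-app2']
```

Under that assumption, any two IP addresses that share a first octet end up with the same short name, while distinct DNS names stay distinct.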

        MavenWire Hosting Admin
        15 years of OTM experience


        • #5
          Re: Scalability not working as advertised/expected

          Please keep in mind that the integration is bypassing the Web Servers altogether (SOAP WSDL) and is "talking" with the default Application Server, which is in the Default cluster. The second application server is also part of the Default cluster. Both App Servers are on the same physical box.

          The Scalability Guide indicates (it refers to this as the "Scalability Topology") that having multiple App Servers in the Default cluster, both equally weighted, should result in work being distributed to both. We are only seeing one App Server (the Default) process work for our big batch. Hope this helps.


          • #6
            Re: Scalability not working as advertised/expected

            Like I mentioned before, if you have 1 large process running, it will not balance across the machines. If you have multiple pieces of integration going in, you should see it balancing across both machines.

            I really can't comment beyond what I have already given you without looking into your environment and seeing the actual setup. In my opinion, scalability is one of the most difficult things to set up in OTM, and it has taken me quite a while to compile a step-by-step guide for both active-active and active-failover. I would strongly recommend logging an issue with Oracle support so that they can look into it for you.

            MavenWire Hosting Admin
            15 years of OTM experience


            • #7
              Re: Scalability not working as advertised/expected

              LenDB2,

              I could be wrong, but from what I have heard, the distribution of tasks across the app servers is handled by a stub on the Tomcat server (tasks initiated internally, such as posting through AQ, may be arbitrated through other means?). I suspect that, since you are posting integration directly to the default app server rather than via HTTP POST through the web servers (or via AQ), you are effectively bypassing the Tomcat load balancer code. I also think the load balancer is designed to send an entire task to one server or the other, and all the threads associated with that task will also reside on that server. It does not take into account factors such as the CPU utilization of the server or the size of the task; it simply hands off tasks based on the weight factor you specify.

              So the question is: are tasks initiated by the users being handed off to the app servers evenly (as you specified equal weights for both), or are all tasks initiated through web connections ending up on your default server? If the latter is the case, I suspect that the web servers may be misconfigured.

              -Alan


              • #8
                Re: Scalability not working as advertised/expected

                I agree with Alan. The load-balancing algorithm that the OTM Web servers utilize to send tasks to the OTM App servers is a modified Round-Robin and does not take CPU utilization into consideration.
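
As a rough illustration of that behavior, here is a minimal Python sketch of a plain weighted round-robin dispatcher. It is not OTM's actual algorithm (the "modified" part is not described in this thread); `weighted_round_robin`, `dispatch` and the task names are hypothetical. It only shows that whole tasks are handed out by weight, with no regard to CPU load or task size.

```python
from collections import Counter
from itertools import cycle


def weighted_round_robin(weights):
    """Return an endless rotation of server names, each repeated per its weight."""
    rotation = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(rotation)


def dispatch(tasks, weights):
    """Assign each whole task to the next server in the rotation."""
    picker = weighted_round_robin(weights)
    return {task: next(picker) for task in tasks}


# Equal weights of 1, as in the original post.
weights = {"Default": 1, "App99": 1}

# Many separate tasks spread evenly across both servers.
many = dispatch([f"order-{i}" for i in range(10)], weights)
counts = Counter(many.values())
print(counts)  # Counter({'Default': 5, 'App99': 5})

# A single large task lands entirely on one server.
one = dispatch(["bulk-plan"], weights)
print(one)  # {'bulk-plan': 'Default'}
```

Work that never passes through the dispatcher, such as integration posted straight to one app server, is not rotated at all in this model.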

                --Chris
                Chris Plough
                twitter.com/chrisplough
                MavenWire


                • #9
                  Re: Scalability not working as advertised/expected

                  I think the problem could be in how OTM was set up, especially since you are trying to run 2 OTM app server instances on a single physical box. We had a similar situation and a lot of problems with it. Fortunately, we soon figured out that you need to specify unique domain names for each OTM instance (as these are used to generate the WebLogic domain name during installation). Here is what you should do:
                  Create a virtual IP address for each of the 2 app servers and map a different DNS name to each IP address, e.g. otm-app1 == xx.yy.zz.aaa and otm-app2 == xx.yy.zz.bbb
                  This is essential to ensure scalability works properly, especially if both of your instances are on the same physical box.
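
A quick way to sanity-check such a mapping before installing is sketched below in Python. This is only an illustration under assumptions: `check_app_hosts` is a hypothetical helper, and the 192.0.2.x addresses are documentation placeholders, not the real values elided above.

```python
import ipaddress


def is_ip_literal(name):
    """True if the given host name is really just an IP address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def check_app_hosts(hosts):
    """hosts maps each instance's host name to its IP; return a list of problems."""
    problems = []
    for name in hosts:
        if is_ip_literal(name):
            problems.append(f"{name}: host name is an IP literal")
    ips = list(hosts.values())
    shared = {ip for ip in ips if ips.count(ip) > 1}
    for ip in sorted(shared):
        problems.append(f"{ip}: shared by more than one instance")
    return problems


# Distinct DNS names on distinct virtual IPs: nothing to report.
good = {"otm-app1": "192.0.2.10", "otm-app2": "192.0.2.11"}
print(check_app_hosts(good))  # []

# One instance named by IP, and both instances on the same address.
bad = {"10.10.10.10": "10.10.10.10", "otm-app2": "10.10.10.10"}
print(check_app_hosts(bad))
```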

                  Thanks,
                  - Pritam.
