[OpenStack-Infra] translate-dev wildfly/zanata/trove issues

Ian Y. Choi ianyrchoi at gmail.com
Mon Jan 16 04:15:31 UTC 2017


Thanks a lot for such sharing Ian! - not me :)

I am copying to i18n mailing list and Zanata dev team for FYI purpose
(as what Kato-san kindly RST markups for option on another mail thread).

Note that this issue is re-generatable: I am able recreate the issue on 
translate-dev
: When I create a new version from openstack-manuals from master branch 
[1] - 20107950 words,
there is no further web responses from translate-dev.o.o around after 
3/4 of total words were processed
like [2-4]. It was completely fine for a version creation with 2610150 
words.

If someone has a good idea to further investigate on it, please share to 
us :)


[1] 
https://translate-dev.openstack.org/iteration/view/openstack-manuals/master
[2] 
https://translate-dev.openstack.org/iteration/view/openstack-manuals/stable-newton
[3] 
https://translate-dev.openstack.org/iteration/view/openstack-manuals/test-version-creation
[4] 
https://translate-dev.openstack.org/iteration/view/openstack-manuals/test-version-creation2



With many thanks,

/Ian

Ian Wienand wrote on 1/16/2017 12:03 PM:
> Hi,
>
> I was alerted to translate-dev performance issues today.  Indeed, it
> seemed that things were going crazy with the java wildfly process
> sucking up all CPU.
>
> At first there didn't seem to be anything in the logs.  Java was
> clearly going mad however, with the following threads going flat-out.
>
> --- gone crazy processes ---
> 14807 wildfly   20   0  9.892g 4.729g  34232 R 93.7 60.7  26:56.73 java
> 14804 wildfly   20   0  9.892g 4.729g  34232 R 92.4 60.7  26:51.91 java
> 14806 wildfly   20   0  9.892g 4.729g  34232 R 92.4 60.7  26:53.92 java
> 14808 wildfly   20   0  9.892g 4.729g  34232 R 92.4 60.7  26:53.97 java
> 14810 wildfly   20   0  9.892g 4.729g  34232 R 92.4 60.7  26:56.28 java
> 14809 wildfly   20   0  9.892g 4.729g  34232 R 92.1 60.7  26:57.74 java
> 14803 wildfly   20   0  9.892g 4.729g  34232 R 91.1 60.7  26:54.90 java
> 14805 wildfly   20   0  9.892g 4.729g  34232 R 90.4 60.7  26:52.44 java
> ---
>
> Hoping to find a easy smoking-gun, I made the java process dump it's
> threads to see what these are doing
>
> --- thread dump ---
>
> 14807  14804  14806  14808  14810  14809  14803, 14805
> 0x39d7 0x39d4 0x39d6 0x39d8 0x39da 0x39d9 0x39d3 0x39d5
>
> "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fb2a8026000 nid=0x39d7 runnable
> "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fb2a8026000 nid=0x39d7 runnable
> "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fb2a8026000 nid=0x39d7 runnable
> "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fb2a8026000 nid=0x39d7 runnable
>
> "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fb2a8021000 nid=0x39d4 runnable
> "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fb2a8021000 nid=0x39d4 runnable
> "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fb2a8021000 nid=0x39d4 runnable
> "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fb2a8021000 nid=0x39d4 runnable
>
> "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fb2a8024800 nid=0x39d6 runnable
> "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fb2a8024800 nid=0x39d6 runnable
> "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fb2a8024800 nid=0x39d6 runnable
> "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fb2a8024800 nid=0x39d6 runnable
>
> "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fb2a8028000 nid=0x39d8 runnable
> "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fb2a8028000 nid=0x39d8 runnable
> "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fb2a8028000 nid=0x39d8 runnable
> "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fb2a8028000 nid=0x39d8 runnable
>
> "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fb2a802b800 nid=0x39da runnable
> "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fb2a802b800 nid=0x39da runnable
> "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fb2a802b800 nid=0x39da runnable
> "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fb2a802b800 nid=0x39da runnable
>
> "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fb2a8029800 nid=0x39d9 runnable
> "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fb2a8029800 nid=0x39d9 runnable
> "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fb2a8029800 nid=0x39d9 runnable
> "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fb2a8029800 nid=0x39d9 runnable
>
> "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fb2a801f000 nid=0x39d3 runnable
> "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fb2a801f000 nid=0x39d3 runnable
> "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fb2a801f000 nid=0x39d3 runnable
> "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fb2a801f000 nid=0x39d3 runnable
> ---
>
> Unfortunately, these all appear to be GC threads so there's nothing
> really obvious there.
>
> However, eventually in the log you start getting stuff like
>
> ---
> 2017-01-16T01:58:00,844Z SEVERE [javax.enterprise.resource.webcontainer.jsf.application] (default task-43) Error Rendering View[/project/project.xhtml]: javax.el.ELException: /WEB-INF/layout/project/settings-tab-languages.xhtml @117,88 rendered="#{not projectHome.hasLocaleAlias(locale)}": javax.persistence.PersistenceException: org.hibernate.HibernateException: Transaction was rolled back in a different thread!
> ...
> 2017-01-16T01:08:48,805Z WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (pool-5-thread-1) SQL Error: 0, SQLState: null
> 2017-01-16T01:08:48,806Z ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (pool-5-thread-1) javax.resource.ResourceException: IJ000460: Error checking for a transaction
> 2017-01-16T01:08:48,813Z ERROR [org.hibernate.AssertionFailure] (pool-5-thread-1) HHH000099: an assertion failure occured (this may indicate a bug in Hibernate, but is more likely due to unsafe use of the session): org.hibernate.exception.GenericJDBCException: Could not open connection
> 2017-01-16T01:08:48,813Z WARN  [com.arjuna.ats.jta] (pool-5-thread-1) ARJUNA016029: SynchronizationImple.afterCompletion - failed for org.hibernate.engine.transaction.synchronization.internal.RegisteredSynchronization at 3fc123aa with exception: org.hibernate.AssertionFailure: Exception releasing cache locks
> ---
>
> There's literally hundreds of lines of traceback across various
> exceptions, but at the bottom this really seems to point to mysql
> being at the root of all this?
>
> In this case, mysql is running in RAX's cloud db service.  I logged
> into the RAX control-panel and everything for the DB was green.  I
> used the same credentials as zanata to authenticate and I can connect
> to the db via translate-dev.o.o and do a few basic things.
>
> ---
> MySQL [(none)]> USE zanata;
> Reading table information for completion of table and column names
> You can turn off this feature to get a quicker startup with -A
>
> Database changed
> MySQL [zanata]> show tables;
> +----------------------------------+
> | Tables_in_zanata                 |
> +----------------------------------+
> | Activity                         |
> | DATABASECHANGELOG                |
> ...
> ---
>
> So it's not as simple as "db not working" :( I have copied the console
> log to [1] in case that makes any sense to zanata people.
>
> If this persists, and isn't a known Zanata issue, I think we might
> have to enable root access for the db via the API which might allow us
> to do some thing like debug long-queries.  Otherwise, there might be
> something in the mysql logs available to RAX support people that would
> otherwise help identify the issue.  OTOH, maybe it's as simple as
> upping java heaps or something.
>
> In the mean time, I have restarted wildfly which seems to get things
> going for a while.
>
> -i
>
> [1] http://people.redhat.com/~iwienand/console.log.translate-dev-2017-01-16.log.gz
>
> _______________________________________________
> OpenStack-Infra mailing list
> OpenStack-Infra at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra





More information about the OpenStack-Infra mailing list