[OpenStack-Infra] log-classify project update (anomaly detection in CI/CD logs)
Tristan Cacqueray
tdecacqu at redhat.com
Tue Jul 3 07:39:58 UTC 2018
Hello,
This is a follow-up to the initial project creation thread[0].
At the Vancouver Summit, we met to discuss ML for CI[1] and I led a workshop
on logreduce[2]. The log-classify project bootstrap is still waiting
for review[3] and I am still looking forward to pushing the logreduce[4]
source code into openstack-infra/log-classify.
The current implementation is working fine and I am going to enable it
for every job running on Software Factory. However, the core
process, HashingNeighbors[5], is rather slow (0.3 MB per second) and I
would like to improve it and/or implement other algorithms.
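To illustrate the general idea behind such a process, here is a minimal
sketch, assuming the name refers to feature hashing of log lines followed
by a nearest-neighbour search against a known-good baseline. It is not the
logreduce code: the tokenizer, the feature-space size, the cosine distance
and every function name below are assumptions made for the example, and it
uses only the Python standard library.

```python
import hashlib
import math
import re

N_FEATURES = 2 ** 20  # assumed size of the hashed feature space


def tokenize(line):
    """Lowercase the line and keep alphabetic words, dropping numbers and ids."""
    return re.findall(r"[a-z_]{3,}", line.lower())


def vectorize(line):
    """Hash each token into a sparse, L2-normalised {bucket: weight} vector."""
    vec = {}
    for token in tokenize(line):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % N_FEATURES
        vec[bucket] = vec.get(bucket, 0.0) + 1.0
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {b: w / norm for b, w in vec.items()} if norm else {}


def cosine_distance(a, b):
    """Cosine distance between two normalised sparse vectors, clamped at 0."""
    dot = sum(w * b.get(bucket, 0.0) for bucket, w in a.items())
    return max(0.0, 1.0 - dot)


def anomaly_scores(baseline_lines, target_lines):
    """Score each target line by its distance to the closest baseline line.

    0.0 means an equivalent line exists in the baseline; values close to
    1.0 mean nothing similar was seen. Lines without any token score 0.0.
    """
    baseline = [v for v in map(vectorize, baseline_lines) if v]
    scores = []
    for line in target_lines:
        vec = vectorize(line)
        if not vec:
            scores.append(0.0)
        elif not baseline:
            scores.append(1.0)
        else:
            scores.append(min(cosine_distance(vec, ref) for ref in baseline))
    return scores


if __name__ == "__main__":
    good = ["Connection to 10.0.0.1 established", "Job 42 finished"]
    failed = ["Connection to 10.0.0.2 established",
              "Traceback: KeyError in scheduler"]
    for line, score in zip(failed, anomaly_scores(good, failed)):
        print("%.2f | %s" % (score, line))
```

The brute-force search compares every target line with every baseline
line, which is one reason an approach of this kind can be slow on large
logs. An approximate neighbour index or a different model would be the
sort of alternative worth evaluating.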
To do that effectively, we need to gather more datasets[6]. I would like
to propose some enhancements to the os-loganalyze[7] middleware to enable
users to annotate and report anomalies they find in log files.
To store the anomaly references, an additional web service, or
perhaps direct access to an Elasticsearch cluster, would be required.
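As a starting point for discussion, an anomaly reference could be a small
JSON document. The sketch below is purely hypothetical: the AnomalyReport
class and all of its field names are assumptions made for the example, and
it does not reflect any existing os-loganalyze or Elasticsearch schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AnomalyReport:
    """Hypothetical record for one user-reported anomaly."""

    log_url: str             # URL of the annotated log file
    line_numbers: List[int]  # 1-based numbers of the lines flagged as anomalous
    reporter: str            # who reported it, e.g. an IRC name
    # Known-good logs available when the anomaly was found (assumed field)
    baseline_urls: List[str] = field(default_factory=list)
    comment: str = ""

    def to_json(self) -> str:
        """Serialise the report, e.g. for a web service or a document store."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "AnomalyReport":
        """Rebuild a report from its JSON form."""
        return cls(**json.loads(payload))


if __name__ == "__main__":
    report = AnomalyReport(
        log_url="https://logs.example.org/job/1234/console.log",
        line_numbers=[1042, 1043],
        reporter="someone",
        baseline_urls=["https://logs.example.org/job/1200/console.log"],
    )
    print(report.to_json())
```

Recording the baseline alongside each report would let a dataset be
rebuilt later from the same inputs that were available when the anomaly
was discovered.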
In parallel, we need to collect the users' feedback and create datasets daily
using the baseline available at the time each anomaly was discovered.
Ideally, we would publish them on IPFS (or any other network filesystem) so
that they could then be used by anyone willing to work on $subject.
There is a lot to do and it will be challenging. To that end, I would
like to propose an initial meeting with all interested parties.
Please register your IRC name and timezone in this etherpad:
https://etherpad.openstack.org/p/log-classify
Thanks to OpenStack's exceptional infrastructure and the recent Zuul v3 release,
I think we are in a strong position to tackle this challenge.
Other suggestions to bootstrap this effort within our community are welcome.
Best regards,
-Tristan
[0] http://lists.openstack.org/pipermail/openstack-infra/2017-November/005676.html
[1] https://etherpad.openstack.org/p/YVR-ml-ci-results
[2] https://github.com/TristanCacqueray/anomaly-detection-workshop-opendev
[3] https://review.openstack.org/#/q/topic:crm-import
[4] git clone https://softwarefactory-project.io/r/logreduce
[5] https://softwarefactory-project.io/cgit/logreduce/tree/logreduce/models.py
[6] https://softwarefactory-project.io/cgit/logreduce-tests/tree/tests
[7] https://review.openstack.org/#/q/topic:loganalyze-user-feedback