Not the full ES DSL.
Some Venus APIs are converted directly into ES API calls, some query ES data and return the result after further calculation, and some query MySQL data, such as alarms.
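To illustrate the first two cases, here is a minimal sketch in Python. It is not taken from the Venus code base: the function names and the field names (log_message, @timestamp, module) are assumptions made for the example. The first function turns simple search parameters into an Elasticsearch query body, and the second post-processes raw hits before returning them. The MySQL-backed case is not shown.

```python
# Hypothetical sketch: function and field names are illustrative assumptions,
# not the actual Venus implementation.
from collections import Counter


def build_es_query(keyword, start, end):
    """Translate simple search parameters into an Elasticsearch query body."""
    return {
        "query": {
            "bool": {
                # Full-text match on the (assumed) message field.
                "must": [{"match": {"log_message": keyword}}],
                # Restrict results to the requested time range.
                "filter": [
                    {"range": {"@timestamp": {"gte": start, "lte": end}}}
                ],
            }
        }
    }


def count_by_module(hits):
    """Post-process raw hits: count log entries per (assumed) module field."""
    return dict(Counter(hit["module"] for hit in hits))
```

A caller only passes a keyword and a time range, which is why the full DSL is not exposed in this kind of design.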
From: Lei Zhang <lei12zhang12@gmail.com>
Sent: January 15, 2021 11:26
To: Liye Pang(逄立业) <pangliye@inspur.com>
Cc: openstack-discuss@lists.openstack.org
Subject: Re: [all]Introduction to venus which is the project of log management and has been contributed to the OpenStack community
This looks cool.
One question about the Venus API: does it support the full Elasticsearch DSL or just a subset of queries?
On Mon, Jan 11, 2021 at 4:59 AM Liye Pang(逄立业) <pangliye@inspur.com> wrote:
Hello everyone. Following feedback from a large number of operations and maintenance personnel working with InCloud OpenStack, we developed “Venus”, a log management project for OpenStack, and have contributed it to the OpenStack community. The following is an introduction to Venus. If there is interest in the community, we would like to propose it as an official OpenStack project in the future.
Background
In the day-to-day operation and maintenance of a large-scale cloud platform, the following problems are encountered:
- Log queries become time-consuming as the number of servers grows into the thousands.
- Logs are difficult to retrieve, since the platform has many modules, e.g. system services, compute, storage, network and other platform services.
- The large volume and dispersion of logs make faults difficult to discover.
- Because the cloud platform is distributed, its components interact with each other, and logs are scattered across components, locating problems takes more time.
About Venus
Based on OpenStack's key requirements for log storage, retrieval, analysis and so on, we introduced the Venus project, a unified log management module. It provides a one-stop solution for log collection, cleaning, indexing, analysis, alarms, visualization, report generation and other needs, helping operators and maintainers to quickly retrieve logs and solve problems, grasp the operational health of the platform, and improve the management capabilities of the cloud platform.
Additionally, we plan to use machine learning algorithms in this module to quickly locate IT failures and their root causes, and to improve operation and maintenance efficiency.
Application scenarios
Venus plays a key role in the following scenarios:
- Retrieval: Provide a simple and easy-to-use way to retrieve all logs and their context.
- Analysis: Implement log association and field value statistics, and provide multi-scenario, multi-dimensional visual analysis reports.
- Alerts: Convert retrievals into active alerts so that errors can be found in massive volumes of logs.
- Issue location: Establish chain relationships and knowledge graphs to quickly locate problems.
Overall structure
The architecture of the log management system based on Venus and Elasticsearch is as follows:
Diagram 0: Architecture of Venus
venus_api: API module that provides the API and REST API service.
venus_manager: Internal scheduled-task module that implements the core functions of the log system.
Current progress
The current progress of the Venus project is as follows:
- Collection: Developed fluentd collection tasks based on collectd, with plug-ins to read, filter, format and send logs for OpenStack, operating systems, platform services, etc.
- Index: Handle multi-dimensional index data in Elasticsearch, and provide a more concise and comprehensive authenticated interface to return query results.
- Analysis: Analyze and display related module errors, MariaDB connection errors, and RabbitMQ connection errors.
- Alerts: Developed alarm tasks that set thresholds on the number of error logs for different modules at different times, and provide alarm and notification services.
- Location: Developed the call chain analysis function based on the global_requested series, which shows the execution sequence, time, error information, etc., and supports exporting the results.
- Management: Developed configuration management functions for the log system, such as alarm threshold setting, scheduled task management, and log retention time setting.
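As a rough illustration of the threshold-based alerting described above, the following Python sketch counts error logs per module inside a time window and reports the modules that exceed their configured threshold. It is only a sketch under stated assumptions: the function name, the log entry layout (module, time) and the default threshold are hypothetical and do not come from the Venus source.

```python
# Hypothetical sketch of threshold-based alarms; names and data layout are
# assumptions for illustration, not the Venus implementation.
from collections import Counter
from datetime import datetime


def find_alarms(error_logs, thresholds, window_start, window_end,
                default_threshold=10):
    """Return {module: error_count} for modules over their threshold.

    error_logs: iterable of dicts with "module" and "time" (datetime) keys.
    thresholds: per-module limits; other modules use default_threshold.
    """
    # Count only the error logs that fall inside the half-open time window.
    counts = Counter(
        entry["module"]
        for entry in error_logs
        if window_start <= entry["time"] < window_end
    )
    # A module raises an alarm when its count exceeds its threshold.
    return {
        module: count
        for module, count in counts.items()
        if count > thresholds.get(module, default_threshold)
    }
```

Running such a function periodically, once per window, would correspond to the scheduled alarm task; the notification step is left out here.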
Application examples
Two examples of Venus application scenarios are as follows.
1. A virtual machine creation operation is performed on the cloud platform, and the virtual machine is found not to have been created successfully.
First, we find the request ID of the operation and jump to the call chain page for the virtual machine creation.
Then, we can query the calling process, and view and download the detailed logs of the call.
2. On the cloud platform, the error logs of each module can be converted into alarms to notify users.
Further, we can retrieve the details of the error logs and the error log statistics.
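The call chain lookup in the first example can be sketched as follows. This is a minimal illustration, assuming each log entry carries a request ID, a timestamp, a module name, a level and a message; these field names and the function name are hypothetical and are not taken from Venus.

```python
# Hypothetical sketch of call chain reconstruction from a request ID; the
# entry fields and function name are assumptions, not the Venus code.
from datetime import datetime


def build_call_chain(log_entries, request_id):
    """Return the ordered steps of one request, with elapsed time and errors."""
    # Keep only the entries of this request and order them by time.
    chain = sorted(
        (e for e in log_entries if e["request_id"] == request_id),
        key=lambda e: e["time"],
    )
    steps = []
    for index, entry in enumerate(chain, start=1):
        steps.append({
            "step": index,
            "module": entry["module"],
            # Seconds since the first log entry of the request.
            "elapsed_s": (entry["time"] - chain[0]["time"]).total_seconds(),
            "error": entry["level"] == "ERROR",
            "message": entry["message"],
        })
    return steps
```

The returned list gives the execution sequence, timing and error flags for one request, which is the kind of information a call chain page would display or export.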
Next steps
The next steps of the Venus project are as follows:
- Collection: In addition to fluentd, other collection plugins such as logstash will be integrated.
- Analysis: Explore more operation and maintenance scenarios, and perform statistical analysis and alarming on key data.
- Display: The configuration, analysis and alarm functions of Venus will be integrated into Horizon as a plugin.
- Location: Cluster logs, construct a knowledge graph, and integrate an algorithm library to locate the root cause of faults.
Venus Project Registry
Venus library: https://opendev.org/inspur/venus
You can grab the source code using the following git command:
git clone https://opendev.org/inspur/venus.git
Venus Demo
Youtu.be: https://youtu.be/mE2MoEx3awM