Hello everyone. Based on feedback from a large number of operations and maintenance personnel working with InCloud OpenStack, we developed “Venus”, a log management project for OpenStack, and have contributed it to the OpenStack community. The following is an introduction to Venus. If the community is interested, we would like to propose it as an official OpenStack project in the future.


In the day-to-day operation and maintenance of a large-scale cloud platform, the following problems are encountered:

- Log queries become time-consuming as the number of servers grows into the thousands.

- Logs are difficult to retrieve because the platform has many modules, e.g. system services, compute, storage, network and other platform services.

- The large volume and dispersion of logs make faults difficult to discover.

- Because the components of the cloud platform are distributed and interact with each other, and their logs are scattered, locating problems takes more time.

About Venus

Based on the key requirements of OpenStack for log storage, retrieval, analysis and so on, we introduced the Venus project, a unified log management module. This module provides a one-stop solution for log collection, cleaning, indexing, analysis, alarms, visualization, report generation and other needs. It helps operators and maintainers quickly solve log retrieval problems, grasp the operational health of the platform, and improve the management capabilities of the cloud platform.

In addition, we plan to use machine learning algorithms in this module to quickly locate IT failures and their root causes, and to improve operation and maintenance efficiency.

Application scenarios

Venus plays a key role in the following scenarios:

- Retrieval: Provide a simple and easy-to-use way to retrieve all logs and their context.

- Analysis: Correlate logs, compute field-value statistics, and provide multi-scenario, multi-dimensional visual analysis reports.

- Alerts: Convert retrieval queries into active alerts so that errors can be found in massive volumes of logs.

- Issue location: Establish chain relationships and knowledge graphs to quickly locate problems.

Overall structure

The architecture of the log management system based on Venus and Elasticsearch is as follows:

Diagram 0: Architecture of Venus

venus_api: API module that provides the REST API service.

venus_manager: Internal scheduled-task module that implements the core functions of the log system.

Current progress

The current progress of the Venus project is as follows:

- Collection: Developed fluentd collection tasks based on collectd, with plug-ins that read, filter, format and send logs from OpenStack, operating systems, platform services, etc.

- Index: Handle multi-dimensional index data in Elasticsearch, and provide a more concise and comprehensive authentication interface for returning query results.

- Analysis: Analyze and display related module errors, MariaDB connection errors, and RabbitMQ connection errors.

- Alerts: Developed alarm task code that sets thresholds for the number of error logs of different modules at different times, and provides alarm and notification services.

- Location: Developed the call chain analysis function based on the global_requested series, which shows the execution sequence, time, error information, etc., and supports export.

- Management: Developed configuration management functions for the log system, such as alarm threshold settings, scheduled task management, and log retention time settings.
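To make the alert item above more concrete, here is a minimal Python sketch of threshold-based alarming on error-log counts. It is only an illustration, not the Venus implementation: the record shape (timestamp, module, level), the function names count_errors and find_alarms, and the threshold values are all assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_errors(records, window_start, window_end):
    """Count ERROR records per module inside [window_start, window_end).

    `records` is an iterable of (timestamp, module, level) tuples.
    This record shape is hypothetical, not the Venus schema.
    """
    counts = Counter()
    for ts, module, level in records:
        if level == "ERROR" and window_start <= ts < window_end:
            counts[module] += 1
    return counts


def find_alarms(counts, thresholds, default_threshold=10):
    """Return (module, count) pairs whose count exceeds the module threshold."""
    return sorted(
        (module, n)
        for module, n in counts.items()
        if n > thresholds.get(module, default_threshold)
    )


# Sample data, invented for the example.
start = datetime(2021, 1, 1, 12, 0)
records = [
    (start + timedelta(minutes=1), "nova", "ERROR"),
    (start + timedelta(minutes=2), "nova", "ERROR"),
    (start + timedelta(minutes=3), "nova", "INFO"),
    (start + timedelta(minutes=4), "neutron", "ERROR"),
    (start + timedelta(hours=2), "nova", "ERROR"),  # outside the window
]

counts = count_errors(records, start, start + timedelta(hours=1))
alarms = find_alarms(counts, {"nova": 1, "neutron": 5})
print(alarms)
```

In a real deployment the counts would presumably come from an aggregation query against the log index rather than from an in-memory list; the list is used here only to keep the sketch self-contained.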

Application examples

Two examples of Venus application scenarios are as follows.

1. A virtual machine creation operation was performed on the cloud platform, and the virtual machine was not created successfully.

First, we find the request ID of the operation and jump to the call chain page for the virtual machine creation.

Then we can query the calling process, and view and download the log details of the call.
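The idea behind this call chain view can be sketched roughly as follows: collect all log records that share one request ID, order them by time, and look for the first error. This is a hedged illustration only. The field names (request_id, time, module, level, message), the sample records, and the helper names build_call_chain and first_error are assumptions for the example, not the actual Venus data model or API.

```python
from datetime import datetime

# Hypothetical log records; the field names are assumptions for this sketch.
logs = [
    {"request_id": "req-1", "time": "2021-01-01T12:00:00",
     "module": "nova-api", "level": "INFO",
     "message": "create server accepted"},
    {"request_id": "req-1", "time": "2021-01-01T12:00:05",
     "module": "nova-compute", "level": "ERROR",
     "message": "failed to spawn instance"},
    {"request_id": "req-2", "time": "2021-01-01T12:00:01",
     "module": "cinder-api", "level": "INFO",
     "message": "list volumes"},
    {"request_id": "req-1", "time": "2021-01-01T12:00:02",
     "module": "nova-scheduler", "level": "INFO",
     "message": "selected host"},
]


def build_call_chain(records, request_id):
    """Return the records of one request in time order, with offsets in seconds."""
    steps = sorted(
        (r for r in records if r["request_id"] == request_id),
        key=lambda r: datetime.fromisoformat(r["time"]),
    )
    if not steps:
        return []
    first = datetime.fromisoformat(steps[0]["time"])
    return [
        {
            "offset_s": (datetime.fromisoformat(r["time"]) - first).total_seconds(),
            "module": r["module"],
            "level": r["level"],
            "message": r["message"],
        }
        for r in steps
    ]


def first_error(chain):
    """Return the first ERROR step of a call chain, or None if there is none."""
    return next((step for step in chain if step["level"] == "ERROR"), None)


chain = build_call_chain(logs, "req-1")
for step in chain:
    print(f'{step["offset_s"]:>5.1f}s  {step["module"]:<15} {step["level"]:<5} {step["message"]}')
print("first error:", first_error(chain))
```

Sorting by timestamp gives the execution sequence, the offsets give a rough timing of each step, and the first ERROR step is a natural starting point when investigating why the virtual machine creation failed.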

2. On the cloud platform, the error logs of each module can be converted into alarms to notify users.

Further, we can retrieve the details of the error logs and the error log statistics.

Next steps

The next steps of the Venus project are as follows:

- Collection: In addition to fluentd, other collection plugins such as Logstash will be integrated.

- Analysis: Explore more operation and maintenance scenarios, and conduct statistical analysis and alarming on key data.

- Display: The configuration, analysis and alarm functions of Venus will be integrated into Horizon as a plugin.

- Location: Cluster logs and construct a knowledge graph, and integrate an algorithm library to locate the root cause of faults.

Venus Project Registry

Venus library: https://opendev.org/inspur/venus

You can grab the source code using the following git command:

git clone https://opendev.org/inspur/venus.git


Venus Demo

Youtu.be: https://youtu.be/mE2MoEx3awM