Home

Welcome to AIOps@NKU lab!

Our lab focuses on artificial intelligence for IT operations (AIOps), which applies AI methods to analyze the application-level and machine-level data of Internet-based or Web-based services (e.g., search engines, online shopping, online video, and instant messaging). These methods include traditional machine learning methods (e.g., decision tree, SVM, logistic regression), deep learning methods (e.g., Variational Autoencoder, LSTM, CNN), reinforcement learning methods (e.g., MCTS), knowledge graph methods, graph computing methods, and large language models (LLMs). We conduct anomaly detection, root cause analysis, failure localization, and failure prediction, among other tasks, on large volumes of data to keep services reliable and improve user experience. We are collaborating or have collaborated with Huawei, Alibaba, Tencent, China Mobile, Microsoft, Baidu, Bytedance, Huya, and CERNET, so data from these companies can be used to evaluate and optimize the models we propose.

We welcome more undergraduate students to join our lab. You can use real-world data to validate your ideas and deploy your models at world-class companies to improve the experience of millions of users! We will recommend you to top global Internet/IT companies such as MSRA, Alibaba, Baidu, Tencent, Bytedance, Huawei, and Huya.

Our lab is organized by Dr. Shenglin Zhang and Dr. Yongqian Sun. We have published a series of papers, which can be found here. In addition, here are the courses taught by Dr. Shenglin Zhang and Dr. Yongqian Sun.

News

  1. 30/10/2024: We received the only two Best Paper Awards at ISSRE ’24! One was for Best Research Paper and the other for Best Industry Paper. To the best of our knowledge, this is the first time that both awards have been given to the same team!
  2. 16/10/2024: Our paper, “Efficient Multivariate Time Series Anomaly Detection Through Transfer Learning for Large-Scale Software Systems”, is accepted by ACM Transactions on Software Engineering and Methodology (CCF A)
  3. 16/10/2024: Our paper, “No More Data Silos: Unified Microservice Failure Diagnosis with Temporal Knowledge Graph”, is accepted by IEEE Transactions on Services Computing (CCF A)
  4. 08/09/2024: Three of our papers have been accepted by the ISSRE 2024 industry track (CCF B)
  5. 08/07/2024: One of our papers has been accepted and two have been conditionally accepted by ASE 2024 (CCF A)
  6. 08/05/2024: Professor Shenglin Zhang serves as a reviewer for the KDD 2025 ADS Track
  7. 07/30/2024: Our paper, “LabelEase: A Semi-Automatic Tool for Efficient and Accurate Trace Labeling in Microservices”, is accepted by ISSRE 2024 (CCF B)
  8. 05/17/2024: Our paper, “Microservice Root Cause Analysis With Limited Observability Through Intervention Recognition in the Latent Space”, is accepted by KDD (CCF A)
  9. 05/05/2024: Professor Shenglin Zhang serves as the Artifact Evaluation Chair of ISSRE 2024. You are welcome to submit your work to ISSRE 2024!
  10. 04/27/2024: Our paper, “Diagnosing Performance Issues for Large-Scale Microservice Systems with Heterogeneous Graph”, is accepted by TSC (CCF A)
  11. 04/19/2024: Two of our papers, “Fault Diagnosis for Test Alarms in Microservices Through Multi-source Data” and “Illuminating the Gray Zone: Non-Intrusive Gray Failure Localization in Server Operating Systems”, are accepted by ESEC/FSE 2024 (CCF A)
  12. 01/23/2024: Our paper, “Supervised Fine-Tuning for Unsupervised KPI Anomaly Detection for Mobile Web Systems”, is accepted by WWW 2024 (CCF A)
  13. 10/12/2023: Our paper, “AutoKAD: Empowering KPI Anomaly Detection with Label-Free Deployment”, won the Best Research Paper Award at ISSRE 2023!
  14. 07/31/2023: Our paper, “Assess and Summarize: Improve Outage Understanding with Large Language Models”, is accepted by ESEC/FSE 2023 (CCF A)
  15. 07/30/2023: Three papers are accepted by ISSRE 2023 (CCF B).
  16. 06/29/2023: Our paper, “LogKG: Log Failure Diagnosis through Knowledge Graph,” is also accepted by IEEE TSC (CCF A).
  17. 06/14/2023: Our paper, “Robust Failure Diagnosis of Microservice System through Multimodal Data,” is accepted by IEEE TSC (CCF A).
  18. 05/17/2023: Our paper, “Robust Multimodal Failure Detection for Microservice Systems”, is accepted by ACM SIGKDD 2023 (CCF A).
  19. 05/16/2023: Prof. Shenglin Zhang will serve as a TPC member of IEEE ISSRE 2023 (CCF B). You are welcome to submit your work to IEEE ISSRE 2023!
  20. 05/09/2023: Our paper, “Efficient Multivariate Time Series Anomaly Detection Through Transfer Learning for Large-Scale Web Services,” is accepted by ICWS 2023 (CCF B).
  21. 04/23/2023: Our paper, “Efficient and Robust KPI Outlier Detection for Large-Scale Datacenters”, is accepted by IEEE Transactions on Computers (CCF A).
  22. 02/27/2023: Prof. Shenglin Zhang will serve as a TPC member of IEEE ICNP 2023 (CCF B). You are welcome to submit your work to IEEE ICNP 2023!
  23. 01/23/2023: Our paper, “CMDiagnostor: An Ambiguity-Aware Root Cause Localization Approach Based on Call Metric Data,” is accepted by WWW 23 (CCF A).
  24. 05/05/2022: Our paper, “Efficient KPI Anomaly Detection Through Transfer Learning for Large-Scale Web Services,” is accepted by IEEE JSAC (CCF A, Impact Factor: 9.144). Congratulations!
  25. 03/27/2022: Our paper, “Online Malicious Domain Name Detection with Partial Labels for Large-Scale Dependable Systems,” is accepted by The Journal of Systems & Software (CCF B, Impact Factor: 2.829). Congratulations!
  26. 02/28/2022: Dr. Shenglin Zhang will serve as a TPC member of IEEE ISSRE 2022 (CCF B). You are welcome to submit your work to ISSRE 2022!
  27. 02/26/2022: Our paper, “Robust Anomaly Clue Localization of Multi-dimensional Derived Measure for Online Video Services,” is accepted by IEEE Transactions on Services Computing (CCF A, Impact Factor: 8.216). Congratulations!
  28. 02/20/2022: Dr. Shenglin Zhang will serve as a TPC member of IEEE/ACM IWQoS 2022 (CCF B). You are welcome to submit your work to IWQoS 2022!
  29. 02/04/2022: Dr. Shenglin Zhang will serve as a TPC member of IEEE ICNP 2022 (CCF B). You are welcome to submit your work to ICNP 2022!
  30. 01/14/2022: Our paper “Robust System Instance Clustering for Large-Scale Web Services” is accepted by WWW 22 (CCF A).
  31. Dr. Shenglin Zhang will serve as a TPC member of WWW 2022 (CCF A). You are welcome to submit your work to WWW 2022!
  32. Our paper “Jump-Starting Multivariate Time Series Anomaly Detection for Online Service Systems” is accepted by USENIX ATC (CCF A). Congratulations!
  33. Our paper, “Detecting Outlier Machine Instances through Gaussian Mixture Variational Autoencoder with One Dimensional CNN”, is accepted by IEEE Transactions on Computers (CCF A). Congratulations!
  34. Dr. Shenglin Zhang will serve as a TPC member of ISSRE 2021. You are welcome to submit your work to ISSRE 2021!
  35. Our paper “LogClass: Anomalous Log Identification and Classification with Partial Labels” is accepted by IEEE TNSM.
  36. Our papers “Cross-System Log Anomaly Detection for Software Systems” and “Unsupervised Detection of Microservice Trace Anomalies through Service-Level Deep Bayesian Networks” are accepted by IEEE ISSRE 2020.
  37. Our paper, “Localizing Failure Root Causes in a Microservice through Causality Inference”, is accepted by IEEE/ACM IWQoS 2020.
  38. Our paper, “Diagnosing Root Causes of Intermittent Slow Queries in Cloud Databases”, is accepted by VLDB 2020.
  39. Our paper, “Efficient and Robust Syslog Parsing for Network Devices in Datacenter Networks”, is accepted by IEEE Access (JCR Zone 2). Congratulations!
  40. Our work is reported by Nature Research in the article entitled “Finding order in a swirl of information”, which introduces the College of Software, Nankai University.
  41. Our paper, “LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs”, is accepted by IJCAI 2019 (CCF A). Congratulations to Ms. Yuqing Liu, who is now in her junior year at the College of Software, Nankai University.
  42. Our paper, “Causal Analysis of the Unsatisfying Experience in Realtime Mobile Multiplayer Games in the Wild”, is accepted by IEEE ICME 2019 (CCF B). Congratulations!
  43. Our paper, “Robust and Rapid Adaption for Concept Drift in Software System Anomaly Detection”, of which Dr. Shenglin Zhang is the corresponding author, won the “Best Research Paper Award” at the IEEE ISSRE conference (CCF B). Congratulations!
