DiagSoftfailure: Automated Soft-Failure Diagnostic Tool Using Machine Learning for Network Users
As more individuals and organizations move their activities and services online, network performance problems that slow data transfer have become a significant obstacle to a satisfactory user experience. There is currently no fully automated tool that helps network users locate the complex network problems that degrade application performance. DiagSoftfailure (Automated Soft-Failure Diagnostic Tool) fills this gap by applying machine learning to infer, on behalf of network users, the location and root cause of the network failures behind performance degradation.
Upon a user's request via a web browser, the DiagSoftfailure server, deployed at the border router between the campus or enterprise LAN and the backbone network, collects and analyzes the packet trace of the target application for soft-failure diagnosis. It first uses open-source packet capture and TCP trace analysis software (libpcap and tcptrace) to obtain raw features of the network behavior. These raw features are then processed to extract network signatures distinctive enough for effective and reliable diagnosis. Based on the network signature, automated classifiers trained with a combination of supervised and semi-supervised machine learning identify both known and unknown soft failures in the network. Finally, the DiagSoftfailure server sends the user a diagnosis report that lets both novice and expert users view and understand the network condition and its problems.
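The signature and classification steps described above can be illustrated with a minimal sketch. It is not DiagSoftfailure's actual method: the feature names, scale factors, training values, and distance threshold are all hypothetical, and a nearest-centroid classifier with a distance-based rejection rule stands in for the trained classifiers. The rejection rule is one simple way to report a fault as unknown when it resembles no known failure class.

```python
import math

# Hypothetical raw features, loosely modelled on per-connection statistics
# that a TCP trace analyzer could report. Names and scales are assumptions.
FEATURES = ("retransmit_ratio", "avg_rtt_ms", "out_of_order_ratio")
SCALE = {"retransmit_ratio": 0.1, "avg_rtt_ms": 200.0, "out_of_order_ratio": 0.1}


def build_signature(raw, scale):
    """Divide each raw feature by its scale to get a fixed-order vector."""
    return [raw[name] / scale[name] for name in FEATURES]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train(labelled):
    """Map each known failure label to the centroid of its signatures."""
    return {label: centroid(sigs) for label, sigs in labelled.items()}


def diagnose(signature, centroids, max_distance):
    """Return the nearest known label, or 'unknown' if none is close enough."""
    label, dist = min(
        ((lbl, math.dist(signature, c)) for lbl, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= max_distance else "unknown"


# Made-up training data for two illustrative failure classes.
TRAINING = {
    "packet_loss": [
        {"retransmit_ratio": 0.08, "avg_rtt_ms": 40.0, "out_of_order_ratio": 0.01},
        {"retransmit_ratio": 0.06, "avg_rtt_ms": 50.0, "out_of_order_ratio": 0.0},
    ],
    "high_latency": [
        {"retransmit_ratio": 0.005, "avg_rtt_ms": 180.0, "out_of_order_ratio": 0.0},
        {"retransmit_ratio": 0.01, "avg_rtt_ms": 200.0, "out_of_order_ratio": 0.01},
    ],
}
CENTROIDS = train(
    {label: [build_signature(r, SCALE) for r in rows] for label, rows in TRAINING.items()}
)
```

Calling `diagnose(build_signature(sample, SCALE), CENTROIDS, max_distance=0.3)` on a new connection's raw features then yields either one of the trained labels or `"unknown"`. In practice the threshold would have to be tuned on real traces.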
DiagSoftfailure provides automated network soft-failure diagnosis with the following features:
- User-focused diagnosis that requires no cooperation from the network manager
- An adaptive network signature that is robust against data inconsistency and the high dimensionality of network behavior data, for high diagnosis accuracy
- Identification of unknown faults by combining supervised and unsupervised machine learning
- No changes to the OS kernel, which allows implementation flexibility
- A diagnosis report that groups test results into categories in a comprehensive format that novice users can understand
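As an illustration of the last feature, the sketch below groups findings by category and shows technical detail only on request. The `Finding` fields, the category names, and the sample findings are hypothetical and are not taken from DiagSoftfailure's actual report format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    category: str  # hypothetical grouping, e.g. "path" or "end host"
    summary: str   # plain-language sentence aimed at novice users
    detail: str    # technical detail aimed at expert users


def render_report(findings, expert=False):
    """Group findings by category; include technical detail only for experts."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[finding.category].append(finding)

    lines = []
    for category in sorted(grouped):
        lines.append(f"[{category}]")
        for finding in grouped[category]:
            lines.append(f"  - {finding.summary}")
            if expert:
                lines.append(f"      {finding.detail}")
    return "\n".join(lines)


# Made-up findings, used only to show the grouping.
FINDINGS = [
    Finding("path", "Some data is being lost in transit.", "retransmit ratio 8%"),
    Finding("end host", "The receiving computer is slow to accept data.", "zero-window events: 12"),
    Finding("path", "The route to the server is slow.", "average RTT 190 ms"),
]
```

`render_report(FINDINGS)` produces the short view and `render_report(FINDINGS, expert=True)` adds the detail line under each finding, so a single set of results can serve both audiences.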
Together, the features listed above make it quick and easy to identify the specific set of conditions that affect network performance. They help users find the root cause of performance degradation, and they help users and network administrators resolve network problems rapidly, improving connection speeds and reducing user dissatisfaction while lowering network administration costs.