ML, Cybersecurity, CAN, ISO 11898, IDS, Automotive
Developing an Intrusion Detection System (IDS) for Controller Area Networks (CAN) using Machine Learning based on Real Life Traffic Datasets
Abstract
Modern vehicles rely on Controller Area Network (CAN) buses for in-vehicle communication. The CAN protocol is reliable and safe, but it is not a secure communication protocol, which leaves it vulnerable to security threats. This thesis presents a modular Intrusion Detection System (IDS) architecture that uses Machine Learning (ML), is based on real-world traffic datasets, and consists of two modules: a Detector and a Classifier. In real-world scenarios, attack traffic is scarce compared to benign traffic, which makes it difficult for ML algorithms to learn attack patterns for detection and classification. The proposed IDS architecture uses sliding windows to capture the time dependency of attack traffic in order to detect and classify attack patterns in CAN traffic. This study uses a reference dataset containing data collected in real time from Chevrolet and Subaru vehicles. It evaluates how effectively the IDS detects an attack and assigns it to the correct attack type, such as denial of service, spoofing, or fuzzing. It also evaluates the consistency and generalisation of the proposed IDS in detecting attacks across the CAN traffic of different vehicles and across the varying attack types in the reference dataset. The thesis further examines how the use of sliding windows during training affects the performance of the IDS. The proposed IDS was effective in detecting attacks, with a true positive rate of up to 94%, but only partially robust in classifying them: it classified attacks consistently only for vehicles and attack types that it had seen during training. We also provide a modular architecture that can be expanded and improved upon in future work.
Keywords: Controller Area Networks, CAN, Intrusion Detection System, Machine Learning, Real-World Datasets, Classifier, Detector, Chevrolet, Subaru
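The sliding-window idea from the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the frame layout, the window size, the two features, and all function names below are assumptions chosen for illustration. The sketch cuts a stream of CAN frames into overlapping windows, labels a window as attack if any frame in it is an attack, derives simple timing-aware features, and computes the true positive rate over window-level predictions.

```python
from typing import List, Sequence, Tuple

# Hypothetical frame layout: (timestamp_s, arbitration_id, payload, label),
# where label 0 means benign and any other value means attack.
Frame = Tuple[float, int, bytes, int]


def sliding_windows(frames: Sequence[Frame], size: int, step: int = 1) -> List[Sequence[Frame]]:
    """Return every full window of `size` consecutive frames, advancing by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, step)]


def window_label(window: Sequence[Frame]) -> int:
    """Mark a window as attack (1) if it contains at least one attack frame."""
    return int(any(frame[3] != 0 for frame in window))


def window_features(window: Sequence[Frame]) -> List[float]:
    """Two simple features: mean inter-arrival gap and number of distinct IDs."""
    timestamps = [frame[0] for frame in window]
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    distinct_ids = len({frame[1] for frame in window})
    return [mean_gap, float(distinct_ids)]


def true_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """TPR = TP / (TP + FN), computed over window-level labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Made-up frames purely for demonstration; not taken from the dataset.
    demo = [
        (0.00, 0x100, b"\x00", 1),
        (0.01, 0x200, b"\x01", 0),
        (0.02, 0x100, b"\x00", 0),
        (0.03, 0x100, b"\x00", 0),
    ]
    for window in sliding_windows(demo, size=3):
        print(window_label(window), window_features(window))
```

In a two-module design like the one described, the feature vectors would plausibly feed a binary Detector first, with only the windows flagged as attacks passed on to a multi-class Classifier. That hand-off is an assumption about the architecture, not a detail stated above.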
Dataset Used
can-train-and-test: A curated CAN dataset for automotive intrusion detection https://doi.org/10.1016/j.cose.2024.103777
- 91,827,504 benign samples; 74,183,508 attack samples; total 166,011,012 samples.
- 11 types of attacks, including DoS, Gear spoofing, Interval, RPM spoofing, Speed spoofing, Standstill, Systematic, Suppress, and Masquerade.
- 4 different vehicles and 6 drivers of various ages and genders.
- Data (benign and attack) collected live on the road through the OBD (on-board diagnostics) port.
- High-severity attacks on RPM, speed, and gear signals (double and triple signal modification).
- Labeled dataset.
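The sample counts listed above can be checked with a short calculation. The sketch below verifies the total and derives the aggregate benign and attack shares. It also computes inverse-frequency class weights, a common generic way to compensate for class imbalance during training; this weighting is shown as an illustration only and is not necessarily what the thesis uses. The function names are hypothetical, and the balance within individual capture files is not stated here and may differ from the aggregate.

```python
# Sample counts as listed for the dataset above.
BENIGN_SAMPLES = 91_827_504
ATTACK_SAMPLES = 74_183_508


def class_shares(benign: int, attack: int) -> dict:
    """Return the total sample count and the fraction of each class."""
    total = benign + attack
    return {"total": total, "benign": benign / total, "attack": attack / total}


def inverse_frequency_weights(benign: int, attack: int) -> dict:
    """Weight each class by total / (n_classes * class_count), with n_classes = 2."""
    total = benign + attack
    return {"benign": total / (2 * benign), "attack": total / (2 * attack)}


if __name__ == "__main__":
    shares = class_shares(BENIGN_SAMPLES, ATTACK_SAMPLES)
    weights = inverse_frequency_weights(BENIGN_SAMPLES, ATTACK_SAMPLES)
    print(f"total samples: {shares['total']:,}")
    print(f"attack share:  {shares['attack']:.1%}")
    print(f"class weights: {weights}")
```

With this weighting, the rarer class receives the larger weight, so misclassified attack samples cost more during training than misclassified benign ones.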
What I Learned
Through this project, I learned how to work independently, balancing my responsibilities as a working student with my university studies while staying dedicated to the project. This experience taught me to be analytical, accountable, and reliable, so that I could meet all deadlines and deliver high-quality work despite the demands of multiple commitments.


