Machine Learning for Security and Security for Machine Learning

Join this two-day expedition, designed specifically for security professionals who want to understand, build and hack machine learning applications. The course is divided into two parts, ML4SEC and SEC4ML. ML4SEC focuses on the nitty-gritty of building ML applications; in SEC4ML you then learn to hack them.

Machine learning and deep learning are growing rapidly. Businesses, academia and tech enthusiasts are eager to try deep learning on their problems, and many students, professionals and researchers are driven to learn this cool new tech. Like every other technology, ML comes with awesome applications along with some serious security implications.

This course is aimed at security researchers, penetration testers and infosec enthusiasts who want to bridge the gap between infosec and machine learning. Assuming no prior knowledge of mathematics or ML, we will build the intuition behind machine learning methodologies. Attendees will get hands-on experience building applications such as firewalls, IDS/IPS and malware detection engines. The course provides an in-depth understanding of the entire ML pipeline: preprocessing data, building ML models, training and evaluating them, and using trained models for prediction. Well-known machine learning libraries such as TensorFlow, Keras, PyTorch and scikit-learn will be used, giving security professionals end-to-end, ready-to-apply ML knowledge. Along with the applications, the course addresses the vulnerabilities in state-of-the-art machine learning methodologies. Lab material consists of vulnerable machine learning applications that can be exploited to gain a thorough understanding of these vulnerabilities. The reasons these vulnerabilities exist and the defensive strategies against them will also be discussed.

Training Outline

This training is divided into two parts: “ML for Security” and “Security for ML”. Assuming no prior knowledge of mathematics or ML, we will build the intuition behind the algorithms.

ML4SEC

Attendees will get hands-on experience building ML-powered defensive and offensive security tools. The session provides an in-depth understanding of the entire ML pipeline: pre-processing data, building ML models, training and evaluating them, and using trained models for prediction. Well-known machine learning libraries such as TensorFlow, Keras, PyTorch and sklearn will be used.

In this session, we will build up our understanding of basic yet state-of-the-art machine learning algorithms, discuss the mathemagic behind why these models work the way they do, and build and evaluate some smart machine learning applications. By the end, we will have an idea of how to solve a real-world problem using machine learning.

Introduction to Machine learning

Common use cases, where to use and where not to use machine learning
Introduction to Python libraries and packages such as Keras, TensorFlow and sklearn
Overview of how machine learning models are built and deployed in production

Understanding the mathematics and intuition behind the machine learning algorithms used

Supervised learning
Linear regression, logistic regression, neural nets and similar models
Unsupervised learning
Clustering algorithms like k-means
Semi-supervised learning
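
As a rough illustration of the supervised learning item above, the following sketch fits a one-feature logistic regression with plain batch gradient descent. The data, the feature ("failed logins per minute"), the learning rate and the epoch count are all made-up assumptions for illustration, not course material.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(xs, ys, lr=1.0, epochs=5000):
    """Fit a one-feature logistic regression with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Made-up toy data: failed logins per minute; label 1 means "suspicious".
failed_logins = [0, 1, 2, 3, 7, 8, 9, 10]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scaled = [x / 10 for x in failed_logins]  # scale the feature to [0, 1]

w, b = train_logistic_regression(scaled, labels)
print(predict(w, b, 1 / 10), predict(w, b, 9 / 10))
```

In practice a library such as sklearn would handle the optimisation; writing the loop out by hand only serves to show where the gradient comes from.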

Brief introduction to data pre-processing, with demo

Cooking a dataset so that it can be consumed by discussed models
Feature engineering: reducing the dimensionality of the problem or adding more features to the dataset
Removing unnecessary data and handling different data types
Dealing with incomplete data
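
The pre-processing steps listed above can be sketched roughly as follows: missing numeric values are filled with the column mean, numeric columns are min-max scaled, and a categorical column is one-hot encoded. The connection records and column names are invented for illustration, and mean imputation is just one of several possible ways to deal with incomplete data.

```python
# Made-up connection records; None marks a missing value.
records = [
    {"protocol": "tcp", "bytes": 1200, "duration": 0.5},
    {"protocol": "udp", "bytes": None, "duration": 0.1},
    {"protocol": "tcp", "bytes": 300, "duration": None},
    {"protocol": "icmp", "bytes": 60, "duration": 0.3},
]
NUMERIC = ["bytes", "duration"]

def impute_mean(rows, column):
    """Replace missing values in a numeric column with the column mean."""
    present = [r[column] for r in rows if r[column] is not None]
    mean = sum(present) / len(present)
    for r in rows:
        if r[column] is None:
            r[column] = mean

def min_max_scale(rows, column):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    for r in rows:
        r[column] = (r[column] - lo) / span

def one_hot(rows, column):
    """Turn a categorical column into one 0/1 feature per category."""
    categories = sorted({r[column] for r in rows})
    encoded = [[1.0 if r[column] == c else 0.0 for c in categories] for r in rows]
    return categories, encoded

def preprocess(rows):
    rows = [dict(r) for r in rows]  # work on copies; leave the raw data untouched
    for col in NUMERIC:
        impute_mean(rows, col)
        min_max_scale(rows, col)
    categories, encoded = one_hot(rows, "protocol")
    features = [enc + [r[c] for c in NUMERIC] for enc, r in zip(encoded, rows)]
    names = ["protocol=" + c for c in categories] + NUMERIC
    return names, features

feature_names, X = preprocess(records)
for row in X:
    print(row)
```

Each row of `X` is now a fixed-length list of numbers, which is the form the models discussed earlier expect.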

Applications of machine learning in the security domain, with hands-on examples

Detailed process for applying the previously discussed concepts to build applications in both defensive and offensive security.
Image classifier using deep learning
Defensive sec:
Web application firewalls
Intrusion detection systems
Malware detection engine
Offensive sec:
Machine learning for phishing
Machine learning for fuzzing

Evaluate the built models using different evaluation parameters.
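
As a small sketch of what such evaluation parameters might look like, the code below computes accuracy, precision, recall and F1 from true and predicted labels. The labels are made up (1 = malicious, 0 = benign) and are chosen to show why accuracy alone can mislead on imbalanced security data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def evaluate(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels: 8 benign samples, 2 malicious ones, and one attack is missed.
y_true = [0] * 8 + [1, 1]
y_pred = [0] * 8 + [1, 0]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.9 even though half of the attacks go undetected (recall 0.5), which is why recall and precision are usually reported alongside accuracy for detection systems.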

Now that we have made our systems “Intelligent”, is it possible to fool them? Are these applications hackable?

SEC4ML

This part will address the vulnerabilities (such as adversarial learning, model stealing, data poisoning and model inference) in state-of-the-art machine learning methodologies. Lab material will consist of vulnerable machine learning applications that can be exploited to gain a thorough understanding of the discussed vulnerabilities. Possible mitigations for these vulnerabilities will also be discussed.

In this session we will take a deeper look at different flaws in how ML/DL algorithms are implemented, with hands-on examples that explain and attack such vulnerable implementations, followed by a discussion of possible mitigations.

Brief introduction to vulnerabilities in Machine Learning

Discussion on various ways of compromising machine learning apps

Adversarial learning attacks

Introduction and mathematical intuition behind the existence of this flaw
Demo and hands-on practice of fooling highly accurate, state-of-the-art image classifiers
Analysing why this attack works
Possible mitigation
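
To give a feel for why such attacks work, here is a minimal sketch of a fast-gradient-sign style perturbation against a tiny logistic classifier. The weights, bias, input and epsilon are hand-picked assumptions standing in for a trained model; a real image classifier has far more inputs, but the idea of nudging every input slightly in the direction that increases the loss is the same.

```python
import math

# Hypothetical, hand-picked parameters standing in for a trained classifier.
WEIGHTS = [2.0, -3.0, 1.5, 0.5]
BIAS = -0.2

def predict_proba(x):
    """Probability of class 1 under a logistic model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y_true, epsilon):
    """Move each input by epsilon in the direction that increases the log-loss."""
    p = predict_proba(x)
    # For a logistic model, d(loss)/d(x_i) = (p - y_true) * w_i.
    grad = [(p - y_true) * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

x = [0.6, 0.2, 0.4, 0.5]            # made-up input, correctly classified as class 1
x_adv = fgsm(x, y_true=1, epsilon=0.2)

print("original   :", round(predict_proba(x), 3))
print("adversarial:", round(predict_proba(x_adv), 3))
```

No single feature moves by more than epsilon, yet the small changes add up across all inputs and push the score over the decision boundary. With thousands of pixels, a much smaller epsilon is typically enough.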

Model stealing attacks

How an attacker can steal proprietary ML models and then use them for free
Stealing offline ML models that are deployed on device with installer packages
Stealing models that are deployed on cloud with restricted access via APIs
Demo
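
As a simplified sketch of the API-based case above, the code below recovers the parameters of a linear scoring model purely by querying it. The `query_api` function and its secret parameters are hypothetical stand-ins for a remote service, and the sketch assumes the service returns raw scores; real deployed models are rarely this simple, and attacks on them usually train a surrogate model on many query results instead.

```python
def make_black_box():
    """Simulate a remote scoring API whose parameters the caller cannot see."""
    secret_weights = [0.8, -1.2, 0.35]  # hypothetical proprietary parameters
    secret_bias = 0.1

    def query_api(x):
        return sum(w * xi for w, xi in zip(secret_weights, x)) + secret_bias

    return query_api

def steal_linear_model(query, n_features):
    """Recover bias and weights of a linear model with n_features + 1 queries."""
    bias = query([0.0] * n_features)            # all-zero input reveals the bias
    weights = []
    for i in range(n_features):
        probe = [0.0] * n_features
        probe[i] = 1.0                          # unit vector isolates one weight
        weights.append(query(probe) - bias)
    return weights, bias

query_api = make_black_box()
stolen_w, stolen_b = steal_linear_model(query_api, n_features=3)

def surrogate(x):
    """Local copy of the model built only from API responses."""
    return sum(w * xi for w, xi in zip(stolen_w, x)) + stolen_b

sample = [0.3, 0.7, -1.0]
print(query_api(sample), surrogate(sample))
```

Once the surrogate matches the API's responses, the attacker no longer needs (or pays for) access to the original service.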

Model Skewing and data poisoning attacks

How and why this attack works
Hands-on example of bypassing ML-based spam filters that report 99.99% accuracy
Possible Mitigation
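
The following toy sketch shows one way poisoning can skew a spam filter: an attacker injects mislabelled samples (for example through a "not spam" feedback button) until the filter lets the target message through. The word-count filter, the messages and the number of poisoned samples are all assumptions made for illustration; it is a simplified naive-Bayes-style score that ignores class priors, not a production filter.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences per class; messages is a list of (text, label)."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in messages:
        counts[label].update(text.lower().split())
    return counts

def spam_score(counts, text):
    """Sum of Laplace-smoothed log-likelihood ratios; positive means 'spam'."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    total_spam = sum(counts["spam"].values()) + len(vocab)
    total_ham = sum(counts["ham"].values()) + len(vocab)
    score = 0.0
    for word in text.lower().split():
        p_spam = (counts["spam"][word] + 1) / total_spam
        p_ham = (counts["ham"][word] + 1) / total_ham
        score += math.log(p_spam / p_ham)
    return score

# Made-up clean training data.
clean = [
    ("win free prize now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch meeting tomorrow", "ham"),
]
# Attacker-controlled samples: spam text deliberately labelled as ham.
poison = [("free prize now", "ham")] * 5

target = "free prize now"
score_before = spam_score(train(clean), target)
score_after = spam_score(train(clean + poison), target)
print("before poisoning:", round(score_before, 3))
print("after poisoning :", round(score_after, 3))
```

The score for the target message is positive (spam) on the clean data and negative (ham) after the poisoned samples are added, so the same message now bypasses the filter.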

Discussion of other, less frequently addressed vulnerabilities and their real-world impact.

CTF challenge focusing on one of the discussed vulnerabilities

What to expect

Thorough understanding of basic machine learning methodologies;
Hands-on practice in specially crafted labs for ML and infosec enthusiasts;
End-to-end, ready-to-apply ML knowledge for security professionals;
Good understanding of machine learning vulnerabilities;
Hands-on experience with well-known machine learning libraries;
Lab material for post-course practice.

Prerequisites

Basic knowledge of Python is good to have but not required;
Basics of Linux and VirtualBox.

Requirements

Laptop with 8GB+ RAM;
30 GB space;
VirtualBox (latest version);
Any flavour of Linux is preferred over Windows;
An open mind, ready for some intense mathemagic.

About the Speaker

Nikhil Joshi

Nikhil Joshi is a Security Researcher at Payatu. He has worked in machine learning for more than 4 years and is currently working on implementations of ML in offensive and defensive security products. At Payatu, he has developed methodologies to pen-test machine learning applications against ML-specific vulnerabilities, and he loves to explore new ways to hack ML-powered applications. In parallel, Nikhil’s research focuses on the security implications of deep learning applications, such as adversarial learning, model stealing attacks and data poisoning.

Nikhil is an active member of local data science and security groups and has delivered multiple talks and workshops. He has spoken at HITB Amsterdam and PhDays Russia, and has presented his research at an IEEE conference. He is a trainer at NullCon. As an applied mathematics enthusiast, Nikhil is especially interested in recent advances in machine learning and its applications in security, behavioural science and telecom.