[Video] Building a Machine Learning Based Web Application Firewall/Intrusion Prevention System From Scratch
Recently I started playing around with machine learning, and I decided to build a small project related to cyber security: a machine learning based web application firewall.
Disclaimer: I’ve worked on web application security for a few years; however, this subject (machine learning) is completely new to me. Just because I make video guides doesn't mean I know everything, so please feel free to correct me if I'm wrong at any point. Full series here:
I used the PyCaret library to develop this IPS from scratch. Here is a summary of what it does:
1. A proxy intercepts all HTTP requests sent to a server.
2. A web application security scanner is run against a dummy web application.
3. The scanner runs in two modes: crawling mode and scanning mode.
4. The intercepting proxy logs all the HTTP requests generated by the scanner. The crawling and scanning HTTP logs are then exported from the proxy.
5. A Python script parses all the HTTP request logs and extracts several features from each raw request. Those features are used to train the model.
6. The exported feature data is then fed to a k-means clustering model for training. We chose to create two clusters: one for good requests and one for bad requests.
7. Once the model is trained, it is deployed and integrated with the HTTP proxy in real time.
8. On live data, the IPS tries to detect whether each request falls into the good cluster or the bad cluster and alerts the user.
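
To make steps 5 to 8 a bit more concrete, here is a rough sketch of the idea, not the actual code from the videos. The project itself uses PyCaret's k-means; this sketch swaps in a tiny pure-Python k-means so it runs with no dependencies. The two features, the keyword list, the sample log lines, and every function name are assumptions made up for illustration.

```python
import math
import re
from urllib.parse import unquote_plus

# Hypothetical feature definitions; the real feature set may differ.
SPECIALS = "'\"<>;()./"
KEYWORDS = re.compile(r"<script|union\s+select|\.\./|etc/passwd|'\s*or\s", re.I)


def extract_features(raw_request):
    """Map one raw HTTP request to [special-char count, keyword hits]."""
    parts = raw_request.splitlines()[0].split(" ")
    target = parts[1] if len(parts) > 1 else ""
    # Only the URL-decoded query string is inspected in this sketch.
    query = unquote_plus(target.partition("?")[2])
    specials = sum(query.count(ch) for ch in SPECIALS)
    return [float(specials), float(len(KEYWORDS.findall(query)))]


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans_two(points, iterations=20):
    """Plain k-means with k=2 and deterministic seeding."""
    first = points[0]
    farthest = max(points, key=lambda p: math.dist(p, first))
    centroids = [list(first), list(farthest)]
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group (keep it if empty).
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


# Made-up stand-ins for the logs exported from the proxy.
crawl_log = [
    "GET /index.php HTTP/1.1",
    "GET /products.php?id=3 HTTP/1.1",
    "GET /about.php?lang=en HTTP/1.1",
]
scan_log = [
    "GET /products.php?id=3%27%20OR%20%271%27%3D%271 HTTP/1.1",
    "GET /search.php?q=%3Cscript%3Ealert(1)%3C/script%3E HTTP/1.1",
    "GET /view.php?file=../../etc/passwd HTTP/1.1",
]

centroids = kmeans_two([extract_features(r) for r in crawl_log + scan_log])

# k-means does not name its clusters, so call "bad" the cluster that
# most scanning-mode requests land in.
scan_votes = [nearest(extract_features(r), centroids) for r in scan_log]
bad_cluster = max(set(scan_votes), key=scan_votes.count)


def is_bad(raw_request):
    """True if a live request falls into the bad cluster."""
    return nearest(extract_features(raw_request), centroids) == bad_cluster


if __name__ == "__main__":
    for req in ("GET /products.php?id=7 HTTP/1.1",
                "GET /search.php?q=%3Cscript%3Ealert(document.cookie)%3C/script%3E HTTP/1.1"):
        print("ALERT" if is_bad(req) else "ok", req)
```

With only two hand-picked features this will miss plenty of attacks (a short injection with few special characters sits close to the good cluster), so treat it as an illustration of the pipeline rather than a usable detector.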