Content-Based Intrusion Detection System
Nobody ever broke into a bank’s IT system by cracking a user’s password. It’s not cost-effective to waste computer time on such a pursuit, for the sake of the few thousand dollars that may, or may not be in the user’s account.
It’s far more cost-effective to persuade the bank to let you have access to its database, via a back door. Then, you have access to all of the bank’s resources, for the expenditure of a minimum of effort, and without even having to understand how the authentication system works.
On the other side of the fence, when your company’s product actually is that bank’s authentication system, and your company describes it as ‘Uncrackable’, you have to expect this to be like a red rag to a bull, as far as the world’s hackers are concerned.
Every day, dozens of them try to break the algorithm, but none ever succeed, so there is some excuse for the complacency which ensues. However, you soon notice that, for every front door attack, there are over a hundred attempts to totally bypass the authentication system, and get in via a back door.
Now, after you’ve told the world that the authentication system is uncrackable, it would be rather embarrassing to find that the hackers had decided not to bother cracking it, but had broken into your authentication server, instead, and hijacked your database.
You have no control over how the average bank, securities trading company or whoever uses your product, configures their online access server or ATM machine, but you can lead by example, and make sure that your authentication server, at least, can be made hack-proof.
Easy, right? All you need to do is to buy a device which will alert you, as soon as it detects a hack attempt, and prevent it succeeding.
If, after a few weeks of searching on the internet, and talking to prospective suppliers, you find that nothing on the market will do what you want, what do you do?
You write your own, of course…
Defining the problem
When we set up the infrastructure for our authentication server’s website, we did all the right things.
The only open port was port 80, there was no GET permission for cgi-bin, no POST permission for htdocs, all other methods like MOVE, DELETE, COPY etc were disabled, and there were no interpreted scripts, like those written in java, perl, shell or ruby.
The only HTML page was index.html, and the other sixty four pages were dynamically created by the CGI — which was an executable, written in a compiled language. That way, if a hacker ran Wget on our site, he’d have no additional clues as to which page called which CGI, or what any of the HTML variables meant.
As far as it went, this setup certainly was secure. We had many connections each day, from the usual hopeful hackers, who would try to get in by breaking the authentication algorithm, and from the old-timers and incompetents, who would try buffer overflow, not having heard that that particular method didn’t work on modern network applications.
Then, after a few months, things changed, as dozens of more determined hackers, with no life of their own, decided that they could combine distributed denial of service attacks with hack attempts. We were inundated with hundreds of queries, each designed to plant or exploit back doors, inject SQL or exploit vulnerabilities in every file whose name ended in ‘.php’.
We don’t use WordPress, cPanel, Joomla, ccmail or any of the other traditionally exploited software packages, so we were immune to all of these attacks, but it was extremely annoying to watch the server logs scrolling like a Las Vegas slot machine, as every unimaginative hack script repeated the same dumb vector anything from two to four hundred times.
Also, it was eating up our network bandwidth, and making the site respond less quickly than we would have liked, and giving perverted pleasure to some hacker, who was watching hundreds of lines of hack script execute.
The last straw came, one day, when we were hit with a DDOS from an address in the Netherlands. It started about 4am, and continued till 11am, during which time the hacker had thrown over twenty thousand vectors at us, at which point, I manually added a firewall rule to block his IP address.
The hacker continued to bang his head against the firewall till around lunch time, on every port from 1024 to 32767, and then gave up. The only positive outcome of this was that, during the attack, all of the other hackers were blocked by the limited remaining bandwidth.
It was obvious that something positive had to be done to stop this nonsense.
We decided to find an intrusion detection system which, everyone agreed, would solve our problem, and made a list of the functions we wanted it to perform.
First, it had to be content-based, so it could identify a hack attempt by the kind of thing the query was trying to do, which implied that such a system would need a certain amount of intelligence.
Second, having identified the hack, it would need to remember the IP address, drop the connection, and make sure that that IP address would never again be allowed to connect to our site.
Last, it would need to do all this in less than one second. The attacks that we faced were not directed from Mum and Dad’s Wintel PC, but from high-end Unix servers in data centres. Having seen the speed at which our log monitor scrolled up the screen when we were under attack, we then examined the access log, and noticed that the average zombie hijacked server could shower us with hack vectors at a minimum rate of two or three a second and, sometimes, if they’d hacked a decent machine, up to ten a second.
Our goal was to stop it after the first vector.
The search for the product
We expected to make a quick list of suitable products, then spend a long period of decision-making, choosing from many suitable candidates. The reality turned out to be a huge disappointment.
We noticed from the first day, that the vast majority of intrusion detection systems were really no more than fancy java, shell and perl scripts, with a response time similar to that of a whale trying to turn itself around.
Disillusioned with the (not so) cheap end of the market, we decided that something used by banks had to be of the right quality, so we took a look at the professional, so-called ‘enterprise level’ products.
While researching this kind of product online, the whole thing got off to an unpromising start, when I read the comments of a security consultant to a bank, describing the product they used. During his speech, he declared proudly, that they would be aware of an intrusion within forty-eight hours of its happening.
Forty-eight hours? To us, forty-eight seconds would be too long, never mind forty-eight hours.
Predictably enough, the search of the high end of the market showed that shell scripts could be available at high prices, too.
Worse, most of this stuff only ran on Windows, and we’re a Sun Solaris shop. Who, in his right mind, would run a website on Windows?
During the demo of one of these products, the salesman explained that his system took its data via a network connection to the actual web server machine, and it had this absolutely mind-blowing graphical display of how your website was being hacked, minute by minute. This was impressive, and a lot easier than watching lines of text scrolling up the screen.
We asked how it worked, and were told that it counted the number of queries received in a given period and, if that exceeded a given value (which we could preset, of course) it flashed a lot of lights on the panel, and sounded an important alarm bell. Yes, but how did it differentiate between a legitimate connection, which just happened to be from a particularly fast machine, and a hack script? Well, it didn’t, but the final decision would be up to its operator. Did that mean that it didn’t automatically cut off the incoming connection? That’s correct. The system administrator would have to do that.
The salesman explained, rather frostily, that what we wanted was an intrusion protection system, not an intrusion detection system.
Since his product was totally unaware of the content of each query, we rejected it, and took a look at another, which claimed to be content-aware. This was more promising, since it was possible to pre-program the thing with a selection from a set of internally stored, popular hack strings, and have it do the usual light flashing and frantic beeping when it discovered something interesting.
Although it ran on Linux, and a source code licence was available (at an additional cost), so that we could recompile it to run on Solaris, it, too, relied on the system administrator to do something about the hacker. Furthermore, there was no provision for adding new hack strings to the list hard-coded inside it.
Further questioning revealed that the thing ran like a packet sniffer, and reassembled each packet’s payload to figure out the query string. This procedure resulted in many false positives, and false negatives, and made its response time less than breathtaking.
The only product which, apparently, did what we wanted was a proxy. Filled with new enthusiasm, we took a cautious look at a few proxy offerings, only to be further disappointed.
Although a proxy really could do content-based filtering, accessing our web pages through it proved to be virtually impossible. Also, the degree of remote control available was strictly limited, to the point of being unusable.
So, there it was. The market was willing to sell us a few Linux offerings, a huge number of Windows ornaments, but nothing that would examine the content of what was trying to get into our website, and automatically drop the connection, if it saw something it didn’t like.
During the time we were talking to the representatives of the various intrusion detection companies, I was thinking about the various issues which surrounded this problem.
Firstly, we absolutely needed to know the content of each query. However, there was no way that we would accomplish this reliably, by tracking text strings across several hundred packets, and then reassembling the original query. The packet stream contained too much information, some of which was irrelevant, and identifying the query with any degree of certainty was too difficult.
What we needed was a pre-assembled query, which was guaranteed to be a query.
I was watching the apache access log monitor scroll enthusiastically up the screen, in the middle of a botnet attack, when it occurred to me that what happened at packet level was totally irrelevant. Bad things would only happen at the time when apache had the whole query in its buffer, and was about to act on it. Therefore, if we read the access log as it was being created, we would only be one line behind apache.
The hack queries themselves didn’t bother us, since they attempted to exploit vulnerabilities in software we didn’t use, so allowing one to slip through would be of no consequence.
So, the only criterion was to identify the first in a series of malicious queries from a given address, and do something before the hacker could send a second query.
Now the question arose, as to what constituted a malicious hack?
So, what is a hack vector?
We filtered our access log, and removed all queries which accessed our legitimate web pages and CGI executables. What remained, according to Sherlock Holmes, had to be the truth.
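The filtering step can be sketched in a few lines of Python. This is only an illustration: it assumes the log is in Apache's Common Log Format, and the set of legitimate paths is a made-up placeholder, since the article does not name the real pages or CGI executables.

```python
import re

# Hypothetical list of the site's own resources; the real names are not given here.
LEGITIMATE_PATHS = {"/", "/index.html", "/cgi-bin/auth"}

# Matches the quoted request field of a log line, e.g. "GET /index.html HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+)[^"]*"')


def is_legitimate(log_line):
    """Return True if the line requests one of the known, legitimate paths."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return False  # malformed request lines are kept for inspection
    path = match.group("target").split("?", 1)[0]  # ignore any query string
    return path in LEGITIMATE_PATHS


def suspicious_lines(log_lines):
    """Keep only the lines that do not touch a legitimate page or CGI."""
    return [line for line in log_lines if not is_legitimate(line)]
```

Whatever survives this filter is the raw material for the signature hunt described next.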
A lengthy and detailed examination of the logs showed a rich selection of attempted hacks.
One that was extremely prolific, was a GET followed by a series of ‘../../..’ of varying lengths, terminating in some significant filename, like /etc/passwd. This would have to be the first on our list, since so many hackers, with no knowledge of Unix, thought it had some chance of succeeding.
Next, we noticed blocks of up to a thousand hexadecimal characters, each preceded by a percent sign.
Decoding these revealed that they were either IP addresses or filenames, which some incompetent hacker assumed would slip past the casual observer. This hack’s secondary function was an attempted buffer overflow, caused by its sheer length. A definite second choice for blocking.
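To see what such a block hides, the percent-encoded runs can be pulled out and decoded. The sketch below uses Python's standard `urllib.parse.unquote`; the minimum run length of eight is an arbitrary assumption, not a figure from our logs.

```python
import re
from urllib.parse import unquote

# A run of at least MIN_RUN consecutive %XX escapes; the threshold is an assumption.
MIN_RUN = 8
PERCENT_RUN_RE = re.compile(r"(?:%[0-9A-Fa-f]{2}){" + str(MIN_RUN) + r",}")


def decode_percent_runs(query):
    """Return the decoded text of every long percent-encoded run in the query."""
    return [unquote(run, errors="replace") for run in PERCENT_RUN_RE.findall(query)]
```

For example, the run `%2F%65%74%63%2F%70%61%73%73%77%64` decodes to `/etc/passwd`, while a short, ordinary escape such as `%20` is ignored.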
Almost identical in purpose, was a similar hack, but with the percent sign replaced with ‘\x’. However, the hexadecimal values weren’t ASCII.
This was a puzzle, which took a lot of research, until I recognized one of the hexadecimal values as being the Intel processor opcode for ‘CALL subroutine’. Hackers call these things shellcodes, and the intent is to execute a buffer overflow, so they can place Intel machine code in the system’s RAM, take over the CPU’s program counter, index it to point to their own code, and execute anything they like on your machine. For any machine running on an Intel CPU, this would be the kiss of death.
With so many ‘\x’ characters, this hack was easy to identify, so we added it to the list.
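A rough way to spot this pattern is to collect the `\xNN` escapes from the logged query and check how many there are. The sketch below is an assumption-laden illustration: the threshold of four escapes and the extra test that most of the bytes are not printable ASCII are my own choices, and the function names are hypothetical.

```python
import re

# Literal backslash-x escapes as they appear in the log text, e.g. \x90\xe8
HEX_ESCAPE_RE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def escaped_bytes(query):
    """Collect the byte values written as backslash-x escapes in the query."""
    return bytes(int(pair, 16) for pair in HEX_ESCAPE_RE.findall(query))


def looks_like_shellcode(query, min_escapes=4):
    """Flag queries with many escapes, most of which are not printable ASCII."""
    data = escaped_bytes(query)
    if len(data) < min_escapes:
        return False
    non_printable = sum(1 for b in data if b < 0x20 or b > 0x7E)
    return non_printable * 2 > len(data)
```

Counting the escapes alone would be enough for blocking; the printable-ASCII test only helps when examining the logs by hand, to separate machine code from obfuscated text.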
Then, there was the embedded question mark, usually followed by what looked like a script of some kind, and the embedded exclamation mark, usually in the middle of a lot of different hexadecimal stuff, which was obviously up to no good.
There were also the SQL injection hack attempts. I guess the most original, was one which attempted to overflow the CAPTCHA buffer (which we didn’t use) with a script like this:
We decided against wasting computing time on these, since our other criteria, such as the string ‘.php’ and the percent signs, would easily identify them. Finally, there were quite a few hacks containing an embedded series of plus signs, usually accompanied by a string of hexadecimal, or plain text like ‘Result:+no+post+sending+forms+are+found’. Just for the sake of complete coverage, we added a line of code to reject these.
Collating all of the information revealed something even more interesting. It became obvious that approximately ninety percent of all hack attempts of all kinds were aimed at dozens of different PHP files.
The attacks varied from simple GET queries and POST queries, to a pattern, where an initial query would attempt to GET a file like index.php (presumably, to establish its existence) and be followed by a second query, which would try to POST to the same file, and overwrite it with a back door. Then, a third query would try another GET.
In the light of these observations, we decided that another primary candidate for blocking would be any query containing the string ‘.php’.
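Taken together, the signatures collected so far amount to a short table of patterns. The following Python sketch shows one possible shape for such a table; it is not the code we run (ours is compiled), and every threshold and label in it is an illustrative guess.

```python
import re

# Each entry pairs a label with a pattern; the thresholds are illustrative guesses.
SIGNATURES = [
    ("directory traversal", re.compile(r"(?:\.\./){2,}")),
    ("php probe", re.compile(r"\.php", re.IGNORECASE)),
    ("percent-encoded run", re.compile(r"(?:%[0-9A-Fa-f]{2}){8,}")),
    ("hex escape run", re.compile(r"(?:\\x[0-9A-Fa-f]{2}){4,}")),
    ("plus-sign run", re.compile(r"(?:[^+\s\"]+\+){3,}")),
]


def match_signature(query):
    """Return the label of the first signature found in the query, or None."""
    for label, pattern in SIGNATURES:
        if pattern.search(query):
            return label
    return None
```

The order of the table decides which label is reported when a query trips several signatures at once; for blocking purposes, any match is enough.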
Command and Control
What happens once the malware is installed on your computer?
Since Unix, unlike Windows, doesn’t permit self-executing executables, the hacker needs to access his malware after it has been installed.
How is he going to do so? Any self-respecting server will have all ports closed except port 80, and believe itself to be totally impregnable. Unfortunately, this is not the case, since it is through port 80 that the C&C will wake up and direct the malware.
Almost all security devices concentrate on monitoring and defending TCP traffic through port 80. The C&C, on the other hand, talks to the malware using the UDP protocol, also through port 80, and is invisible to apache, and to many security systems.
It’s perfectly reasonable to block UDP traffic, with few resulting issues. However, just to complicate matters, there are other services, which run on UDP. DNS queries and replies, the Unix XDMCP login, and time server data are just a few examples. Any firewall rule which blocks UDP traffic, has to exclude these.
Dropping the connection
When we reached this point in the investigation, I could almost write the code for the content analyzer in my head, and it was beginning to look more and more possible that we could write our own intrusion detection system. Then, I thought about the tricky part: dropping the connection.
The first thing to come to mind was a utility called tcpkill, which will very nicely drop an established TCP connection. However, a moment’s reflection showed that this would be inadequate. The average hack script re-sent the same line anything up to four hundred times and, if we invoked tcpkill every time, not only would the network traffic be no lighter, but the CPU would chase its tail trying to keep up with the repeated hack attempts as well, especially when handling an attack from a few dozen servers simultaneously.
The next thought was that we would use the firewall.
Since we expected that our IDS would be a stand-alone process, it would be necessary to use a firewall which was remotely programmable. Almost every supplier that we contacted claimed to have such a device, so things were looking very promising.
Unfortunately, firewalls are very security-conscious animals, and the only way to remotely program them, is to log in to them first. The procedure for doing this was either through a gee-whiz graphical user interface, or via a telnet or SSH TCP connection. The GUI was obviously unacceptable, so we wrote a piece of code which established a telnet connection to the firewall and sent it a new rule. Ten seconds later, it was back on line.
Most firewalls contain a minimal Linux computer, and every time a new rule is added, this computer is rebooted. Even though ten seconds is a very short time for a reboot, it was just too long for our purposes.
Apart from the huge delay to add another rule, during that ten seconds, the machine would be sitting there with open arms, welcoming all hackers to do their worst, since the firewall was resetting itself, and totally inoperative. Further, that ten seconds would allow several hundred new hack attempts to queue up for processing, resulting in a never-ending shuffle between our content analyzer and the firewall.
Firewalls were abandoned, and we turned our attention to the Unix operating system.
Solaris has an extremely powerful utility, called ‘ipf’, which is a version of the ‘ipfilter’ module, which dates back to SunOS 4.1.3, in the good old BSD days.
It has all of the facilities available in stand-alone firewalls, such as NAT, but the filtering is actually performed in the Unix kernel, making it extremely efficient. It gets its rule set from a file, which is a minor drawback, but we decided to try it, anyway.
I wrote another piece of code, which appended a new firewall rule to the file, then told ipf to re-read the file and restart. We ran a few tests, and found that the time delay was almost immeasurable.
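In outline, that piece of code does two things: append one line to the rule file, and run one command. A minimal Python sketch follows, with several assumptions stated plainly: the rule text uses the usual ipf form `block in quick from ADDRESS to any`, the reload command is passed in by the caller rather than hard-coded, and the function names are hypothetical. A typical reload on Solaris would be `ipf -Fa -f /etc/ipf/ipf.conf`, but check the path and flags against your own system before relying on them.

```python
import subprocess


def block_rule(ip_address):
    """Build an ipf rule that drops all inbound traffic from one address."""
    return f"block in quick from {ip_address}/32 to any"


def append_and_reload(ip_address, conf_path, reload_cmd=None):
    """Append a block rule to the rule file, then optionally ask ipf to reload it."""
    with open(conf_path, "a", encoding="ascii") as conf:
        conf.write(block_rule(ip_address) + "\n")
    if reload_cmd:
        # Assumed reload command, e.g. ["ipf", "-Fa", "-f", conf_path]; verify locally.
        subprocess.run(reload_cmd, check=True)
```

Leaving `reload_cmd` unset makes it possible to test the file handling on a machine which has no ipf installed.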
This is actually not that surprising. Since the filtering is done in the kernel, there is no actual ipf process. When a user issues a command to re-read the configuration file, the kernel activates a ‘read’ system call, which is internal to itself, so there isn’t even a separate process to re-spawn. The only delay, is the time taken to execute the disk I/O — which is always a high priority task, since the kernel knows it takes a long time.
We decided against including any facility to count the number of queries in a given time interval. If the purpose was to identify a DDOS, then the hardware firewall could adequately cope with it. Also, this would mean repeatedly stopping other processing for the duration of that time interval. This could add an order of magnitude to the response time.
The complete system
We now had all of the building blocks for a complete intrusion detection — or, more accurately, intrusion protection system.
On startup, the IDS would read the ipf configuration file, and store all of the rules in an array of data structures. This would put the IDS in sync with the firewall, which was necessary, so that we didn’t try to add a rule for an IP address which was already being blocked.
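The startup step might look like this in Python. The sketch uses a set of addresses rather than an array of structures, and it assumes the block rules have the simple form `block in quick from ADDRESS to any`; any other line in the file is skipped.

```python
import re

# Matches rules of the assumed form: block in quick from 198.51.100.7/32 to any
BLOCK_RULE_RE = re.compile(
    r"^\s*block\s+in\s+quick\s+from\s+(\d{1,3}(?:\.\d{1,3}){3})(?:/32)?\s+to\s+any\b"
)


def load_blocked_addresses(conf_lines):
    """Return the set of addresses already blocked by the rule file."""
    blocked = set()
    for line in conf_lines:
        match = BLOCK_RULE_RE.match(line)
        if match:
            blocked.add(match.group(1))
    return blocked
```

A set makes the later "is this address already blocked?" check a single lookup, however long the rule file grows.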
Next, we called a function which opened the apache access log, and performed a seek to the last line in the file. Having done that, it entered an endless loop, and waited for another line to be added to the file.
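Seeking to the end of the log and then waiting for new lines is the same thing `tail -f` does. Here is a minimal Python sketch of the idea; `follow` and `poll_interval` are names chosen for this example, and the default of no pause between reads mirrors the busy loop described in this article.

```python
import os
import time


def follow(handle, poll_interval=0.0):
    """Seek to the end of an open log file and return an iterator over lines added later."""
    handle.seek(0, os.SEEK_END)  # skip everything already in the log

    def new_lines():
        pending = ""
        while True:
            chunk = handle.readline()
            if not chunk:
                if poll_interval:
                    time.sleep(poll_interval)  # optional pause; 0 means busy-wait
                continue
            pending += chunk
            if pending.endswith("\n"):  # only hand over complete lines
                yield pending.rstrip("\n")
                pending = ""

    return new_lines()
```

The `pending` buffer covers the case where the reader catches the web server halfway through writing a line.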
The loop contained the code of the content analyser, and had no time delays or pauses built in, so it would execute as fast as the CPU could execute machine code. This is usually extremely bad practice, since it uses 100% of the CPU’s processing power. In our case, it didn’t matter, since our machine had 32 CPUs, and devoting one of them to the IDS was a good investment.
As soon as apache logged another query, the content analyzer would scan it to see if it contained any of the hack signatures which we had built into it. If a hack attempt was identified, a firewall rule would be automatically created, to make comparison with the stored firewall rules easier, then another function would be called, to see if that IP address was already being blocked.
If this turned out to be a new hack attempt, the ipf configuration file would be opened, the new rule appended, and ipf re-invoked so it could re-read the file. Having performed the most important operations, the IDS would then add the new rule to its internal store.
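The order of operations for a single log line can be sketched as below. The function and parameter names are hypothetical, the signature test and the firewall update are passed in as callables so the flow can be read on its own, and the client address is assumed to be the first field of the log line, as in Apache's Common Log Format.

```python
def handle_query(log_line, blocked, is_malicious, apply_block):
    """Process one access-log line; return the newly blocked address, or None."""
    if not is_malicious(log_line):
        return None
    ip_address = log_line.split(" ", 1)[0]  # first field of the log line
    if ip_address in blocked:
        return None  # a rule for this address already exists
    apply_block(ip_address)  # update the firewall first
    blocked.add(ip_address)  # then the in-memory copy of the rules
    return ip_address
```

Updating the firewall before the internal store keeps the urgent work at the front: the bookkeeping can wait a few microseconds, the hacker's second query cannot.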
There is a great temptation, when designing a system like this, to use multi-threading, or parallel processing. Although this would have considerably sped up part of the processing cycle, the dangers of collision, between threads or processes, in areas such as file reading or writing were too great. Semaphores and mutexes are traditionally used to obviate such problems but, in general, if you need to use a mutex, you’re either doing it wrong, or you shouldn’t be multi-threading.
After a short period of debugging, the IDS was commissioned, and we monitored its progress over the first week, or so.
The performance was even better than we expected, and there were no false negatives. Anything that was supposed to be stopped, was stopped dead, after apache received just one illegal query.
However, there were a number of false positives.
We examined the logs, and found that some query strings, especially those which were links from some online magazines, and some social media sites contained elements containing the string ‘.php’. This was enough to trip the content analyzer, and have the IP address blocked.
There were so few of these false positives, that we were willing to write this off as acceptable collateral damage, when compared with the enormous benefit of limiting each hacker to one hack attempt per lifetime. With the possible exception of LinkedIn, the social media sites were unlikely to bring us any significant business, but some of the online magazines were important. Accordingly, we added a few lines of code into the loop, which would cause it to ignore any positives containing the names of chosen sites.
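The exception check is no more than a substring test applied before the block is issued. In this Python sketch the site names are placeholders, since the sites we actually exempted are not listed here, and the signature test is again passed in as a callable.

```python
# Hypothetical names; the sites actually exempted are not listed in this article.
EXEMPT_SITES = ("linkedin.com", "example-magazine.com")


def is_exempt(log_line, exempt_sites=EXEMPT_SITES):
    """Return True if the line mentions one of the exempted sites."""
    lowered = log_line.lower()
    return any(site in lowered for site in exempt_sites)


def should_block(log_line, is_malicious, exempt_sites=EXEMPT_SITES):
    """Block only lines that trip a signature and mention no exempted site."""
    return is_malicious(log_line) and not is_exempt(log_line, exempt_sites)
```

A plain substring match is cheap, but anyone who knows the exempted names can include one in a query, so the list is best kept short.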
So far, the IDS has been running continuously for over two years, with no modifications, apart from the periodic addition of new rules, as new hacks are discovered.
If this were a commercial product, we would probably have it read a configuration file on startup, instead of having the hacks and exceptions hard-coded. However, since it isn’t, we don’t mind recompiling it each time there’s an update. It keeps it more secure.
Originally published at https://learncybersec.blogspot.com.