When I've thought about application fingerprinting in the past, it's fallen into two categories:

Passive: Detect applications by examining a stream of data that was generated for some other purpose. Examples include banner grabbing or regexing pages for certain phrases like "Powered By" or <META> generator tags.

Active: Probe the app for certain, unlinked files. Fingerprinting can be done by detecting the presence of files, hashing their contents, or regexing for specific identifiers. The JavaScript port scan I developed back at SPI used the presence of files to fingerprint, while Nikto's favicon fingerprinting uses MD5s of /favicon.ico, and Backend Info uses file probes + regexes. (Rough sketches of both styles are at the end of this post.)

This is an interesting paper that discusses a new(ish) way to passively fingerprint web applications: link structure and forms. I say newish because while at SPI/HP we would often use regexes to examine hyperlinks or CSS/JS includes to roughly detect apps. This was more of a coarse "should I try and send this attack" filter and not a "this page is definitely running phpXYZ version 1.2.3" detection.

Essentially, this paper discusses using the common and repeated structure of links and their parameters, as well as forms and their inputs/types, to create signatures for applications (see the second sketch below). The results are pretty impressive, and I like that it's passive!

A Method of Identifying Web Applications
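
To make the two classic styles concrete, here's a minimal sketch of both: a passive check that regexes a page body for "Powered By" strings and <META> generator tags, and an active check that fetches /favicon.ico and MD5s it, Nikto-style. The regexes, the example URL, and the favicon hash table are illustrative stand-ins of mine, not Nikto's or Backend Info's actual signatures.

```python
import hashlib
import re
import urllib.request

# Hypothetical hash-to-app table; the digest below is a made-up example,
# not a real entry from Nikto's favicon database.
KNOWN_FAVICONS = {
    "f276b19aabcb4ae8cda4d22625c6735f": "ExampleCMS (placeholder hash)",
}

def passive_fingerprint(html: str) -> list:
    """Passive: scrape app names from 'Powered By' footers and
    <meta name="generator"> tags already present in the response."""
    hits = []
    hits += re.findall(r"Powered [Bb]y ([\w.\- ]+)", html)
    hits += re.findall(
        r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']+)',
        html, re.IGNORECASE)
    return hits

def favicon_fingerprint(base_url: str) -> str:
    """Active: probe a well-known unlinked file (/favicon.ico) and
    hash its contents to look it up in a known-hash table."""
    data = urllib.request.urlopen(base_url.rstrip("/") + "/favicon.ico").read()
    digest = hashlib.md5(data).hexdigest()
    return KNOWN_FAVICONS.get(digest, "unknown (" + digest + ")")

if __name__ == "__main__":
    print(passive_fingerprint('<meta name="generator" content="ExampleCMS 1.2">'))
    # print(favicon_fingerprint("http://example.com"))  # active probe
```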
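
And here's a rough sketch of the link/form structure idea, assuming a signature is just a hash over the page's link paths + parameter names and its form actions + input names/types. This is my own simplification for illustration (the LinkFormParser and structural_signature names are mine), not the paper's actual algorithm, but it shows why two content-wise different pages generated by the same app can still collapse to the same signature.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

class LinkFormParser(HTMLParser):
    """Collects hyperlink paths/parameter names and form actions/input types."""
    def __init__(self):
        super().__init__()
        self.links = []   # (path, sorted query parameter names)
        self.forms = []   # (action path, [(input name, input type), ...])
        self._current_form = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            parsed = urlparse(attrs["href"])
            params = tuple(sorted(parse_qs(parsed.query).keys()))
            self.links.append((parsed.path, params))
        elif tag == "form":
            self._current_form = (urlparse(attrs.get("action", "")).path, [])
            self.forms.append(self._current_form)
        elif tag == "input" and self._current_form is not None:
            self._current_form[1].append(
                (attrs.get("name", ""), attrs.get("type", "text")))

def structural_signature(html: str) -> str:
    """Hash only the repeated link/form structure, ignoring page content,
    so pages from the same application tend to share a signature."""
    parser = LinkFormParser()
    parser.feed(html)
    canonical = repr((sorted(set(parser.links)),
                      sorted((action, tuple(inputs))
                             for action, inputs in parser.forms)))
    return hashlib.md5(canonical.encode()).hexdigest()

if __name__ == "__main__":
    page = ('<a href="/index.php?id=3&cat=news">story</a>'
            '<form action="/login.php"><input name="user" type="text">'
            '<input name="pass" type="password"></form>')
    print(structural_signature(page))
```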