I spend a large amount of time looking at URLs, and most of that time is spent looking at the query string. That focus caused me to completely miss a problem in WebInspect's URL normalization logic.

Quick: spaces can be URL encoded as either + or %20, right? True, but only in the query string. So:

http://site.com/foo.php?msg=billy+hoffman
http://site.com/foo.php?msg=billy%20hoffman

both normalize to the same thing. In the path or filename, however, %20 is a space and + is a literal + character. So the URLs:

http://site.com/filename+with+spaces.html
http://site.com/filename%20with%20spaces.html

point to two different files on the web server, named "filename+with+spaces.html" and "filename with spaces.html" respectively.

If you are someone who has to handle URLs, remember that you cannot do a blanket url = url.Replace("+", "%20") as part of your normalization logic.
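To make the distinction concrete, here is a minimal sketch in Python (not the .NET-style code hinted at above, and not WebInspect's actual logic) that normalizes the path and the query string separately. The function name normalize_url is hypothetical, and the sketch assumes a simple URL; it ignores edge cases such as encoded slashes (%2F) in the path.

```python
from urllib.parse import (
    parse_qsl, quote, unquote, urlencode, urlsplit, urlunsplit,
)


def normalize_url(url: str) -> str:
    """Hypothetical helper: normalize path and query with different rules."""
    parts = urlsplit(url)

    # Path: decode percent-escapes only. unquote() leaves '+' alone,
    # so a literal plus in a filename stays a plus.
    path = quote(unquote(parts.path), safe="/+")

    # Query: parse_qsl() applies form-encoding rules, where both '+'
    # and '%20' decode to a space. Re-encode spaces uniformly as '%20'.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(pairs, quote_via=quote)

    return urlunsplit(
        (parts.scheme, parts.netloc, path, query, parts.fragment)
    )


if __name__ == "__main__":
    for u in (
        "http://site.com/foo.php?msg=billy+hoffman",
        "http://site.com/foo.php?msg=billy%20hoffman",
        "http://site.com/filename+with+spaces.html",
        "http://site.com/filename%20with%20spaces.html",
    ):
        print(normalize_url(u))
```

With this approach the two query-string URLs come out identical, while the two path URLs stay distinct, which is the behavior a blanket replace of + with %20 would break.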