Copyright © The OWASP Foundation
Permission is granted to copy, distribute and/or modify this document under the terms of the OWASP License.
The OWASP Foundation
SecTor 2008
http://www.owasp.org
googless
Christian Heinrich aka cmlh
OWASP “Google Hacking” Project Lead
http://www.linkedin.com/in/ChristianHeinrich
Copyright Notice
Slides and Notes Licensed as:
- AU Creative Commons 2.5
- Attribution-Non Commercial-No Derivative Works
Updates to PowerPoint Slides
Incorporates all previous slides from:
- ACM Computer Security Day 2007
- OWASP AU Conference 2008
- OWASP USA Conference 2008
- ToorCon X
- SecTor 2K8
Last Updated 19 October 2008
If You Download these Slides Later?
View->Notes Page has references and further info
Some slides are hidden due to time limit
About cmlh
Over 12 Years of “End User” Experience:
- Security Thought Leader within AU Media
  - Currently Largest AU Cable Broadcaster
    - Governance (e.g. PCI DSS v1.2, ISO 27001)
    - Windows and UNIX Technologies
    - Media Specific, e.g. Digital Rights Management (DRM)
  - Past NSW Limited (Part of News Corporation)
http://www.linkedin.com/in/ChristianHeinrich
About cmlh
Over 12 Years of “End User” Experience:
- Past .gov.au
  - DSD Certified Gateway Service Provider
    - ASIO Web Hosting
  - Government Endorsed Business (GEB)
- Past State .nsw.gov.au
  - Critical Infrastructure (Utilities – Electricity)
http://www.linkedin.com/in/ChristianHeinrich
Contributions as OWASP “Google Hacking” Project Lead
- OWASP Testing Guide v3
  - 4.2.1 “Spiders/Robots/Crawlers”
  - 4.2.2 “Search Engine Reconnaissance”
- Proof of Concepts (PoC)
  - “TCP Input Text”
  - “Download Indexed Cache”
  - “Speak English” Google Translate Workaround
- Presented at OWASP AU and USA Conferences 2008
OWASP Testing Guide v3
4.2.1 “Spiders/Robots/Crawlers”
What is a Spider/Robot/Crawler?
1. Automatically Traverses Hyperlink[s]
2. Recursively Retrieves URLs Referenced
Differentiators Between Implementations Include:
- Apply Heuristics to Selection of Hyperlink[s]
- HTTP GET Request at Random Time Interval
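The two numbered behaviours above (traverse hyperlinks, then recursively retrieve what they reference) can be sketched in a few lines of Python. This is only an illustration: the `PAGES` dictionary, the `example.test` URLs, and the function names are hypothetical stand-ins so the sketch runs without network access, and it omits the heuristics and randomised request timing that real implementations add.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" standing in for real HTTP GET requests.
PAGES = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.test/a.html": '<a href="/">Home</a> <a href="/c.html">C</a>',
    "http://example.test/b.html": "no links here",
    "http://example.test/c.html": '<a href="/a.html">A</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch=PAGES.get):
    """Depth-first traversal; returns URLs in the order they were retrieved."""
    seen = set()
    order = []

    def visit(url):
        if url in seen:
            return  # never retrieve the same URL twice
        seen.add(url)
        body = fetch(url)
        if body is None:
            return  # unknown URL in this toy "web"
        order.append(url)
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.links:
            visit(urljoin(url, href))  # resolve relative links, then recurse

    visit(start_url)
    return order


if __name__ == "__main__":
    for url in crawl("http://example.test/"):
        print(url)
```

The `seen` set is what keeps the recursion from looping when pages link back to each other, as `a.html` and `c.html` do here.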
Robots Exclusion Protocol
Expected Behaviour Dictated by Web Server
- Permit/Deny Indexing, Caching, etc
- Applies to a specific robot implementation
Specified by Two (2) Controls:
1. <META> tags within HTML <HEAD>
2. robots.txt within Web Root Directory
<META> Tags
<META NAME="robots" [snip]
- Applies to all robot implementations
<META NAME="googlebot" [snip]
- Applies to Googlebot
<META> Tags
[snip] content="noimageindex">
- Allows Indexing but Excludes Any Images
[snip] content="noindex">
- Prevents Indexing but Hyperlink[s] Followed
<META> Tags
[snip] content="noindex,nofollow">
- Prevents Indexing and Following Hyperlink[s]
[snip] content="noarchive">
- Removes the “Cached” Version of the Page
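As a rough illustration of how a well-behaved robot might read these tags, the sketch below collects the directives from `<META>` tags named either `robots` or the robot's own name. The function and class names and the sample HTML are assumptions made for the example; this is not how any particular search engine implements it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Gathers content= directives from <META> tags that apply to one robot."""

    def __init__(self, robot_name):
        super().__init__()
        # "robots" applies to every implementation; robot_name only to itself.
        self.names = {"robots", robot_name.lower()}
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in self.names:
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def directives_for(html, robot_name):
    """Return the set of directives a robot called robot_name should honour."""
    parser = RobotsMetaParser(robot_name)
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<META NAME="robots" content="noindex,nofollow">'
        '<META NAME="googlebot" content="noarchive">'
        '</head></html>'
    )
    print(sorted(directives_for(page, "googlebot")))
    print(sorted(directives_for(page, "otherbot")))
```

A robot would then skip indexing when `"noindex"` is in the returned set, skip link traversal on `"nofollow"`, and so on.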
robots.txt
Located in Web Root Directory
cmlh$ wget http://www.google.com/robots.txt
--23:59:24-- http://www.google.com/robots.txt
=> 'robots.txt'
Resolving www.google.com... 74.125.19.103, 74.125.19.104, 74.125.19.147, ...
Connecting to www.google.com|74.125.19.103|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
[ <=> ] 3,425 --.--K/s
23:59:26 (13.67MB/s) - 'robots.txt' saved [3425]
User-agent: Directive
User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
Disallow: /groups
Disallow: /images
…
“*” Applies to all Spiders/Robots/Crawlers
User-agent: Directive
User-agent: Googlebot
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
Disallow: /groups
Disallow: /images
…
“Googlebot” Applies to “Googlebot” Crawler
Disallow: Directive
User-agent: Googlebot
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
Disallow: /groups
Disallow: /images
…
Can be intentionally ignored:
- Not for Access Control within Web Server
- Not for Digital Rights Management (DRM)
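To see why robots.txt is advisory only, it helps to look at where the check happens: on the client. The sketch below uses Python's standard `urllib.robotparser` against a subset of the rules shown above (the query-string rules are left out for brevity). The `ExampleBot` name and the `/maps` path are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Subset of the robots.txt rules shown on the slides.
ROBOTS_TXT = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
"""


def load_rules(text):
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    # A polite crawler asks before each request; nothing forces it to.
    for path in ("/searchhistory/", "/search", "/maps"):
        verdict = "allowed" if rules.can_fetch("ExampleBot", path) else "disallowed"
        print(path, verdict)
```

The web server never sees this decision. A crawler that skips the `can_fetch` call retrieves `/search` just the same, which is why the slide rules out robots.txt for access control and DRM.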
Testing robots.txt
1. Sign into Google Webmaster Tools
2. On the Dashboard, click the URL
3. Click “Tools”
4. Click “Analyze robots.txt”
Recommendations
Implement robots.txt as Primary Control
- Consider <META> as Secondary Control
  - Supported by Minority of Spiders/Robots/Crawlers
Reference All Spiders/Robots/Crawlers
- User-agent: * within robots.txt
- <META NAME="robots" …
Recommendations
Avoid Regular Expressions
- Supported by Minority of Robot Implementations
- Spell Out [0-9] as 10 Individual Lines
Implement Other Technologies:
- DRM for Rights Management of Web Content
- Access Control within Web Server
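The advice to spell out `[0-9]` as ten individual lines can be automated when robots.txt is generated. The sketch below assumes a hypothetical rule, `Disallow: /archive[0-9]/`, and expands only the first literal `[0-9]` it finds. It is a small helper for illustration, not a general regular-expression expander.

```python
def expand_digit_class(rule):
    """Expand the first literal "[0-9]" in a rule into ten explicit rules.

    Rules without "[0-9]" are returned unchanged, as a one-element list.
    """
    if "[0-9]" not in rule:
        return [rule]
    return [rule.replace("[0-9]", str(digit), 1) for digit in range(10)]


if __name__ == "__main__":
    # Hypothetical rule; prints "Disallow: /archive0/" through "Disallow: /archive9/".
    for line in expand_digit_class("Disallow: /archive[0-9]/"):
        print(line)
```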
OWASP Testing Guide v3
4.2.2 “Search Engine Discovery”
How Google Indexes My Site:?
- Crawled by “Googlebot”:
  - URLs Crawled Previously by “Googlebot”
  - Sitemap provided by Webmaster
- Indexed by <TITLE> and <ALT> Attributes
- Retained in Google Cache
- Repeat
Residual Risk of “Google Hacking”
Your site: Published by Google Cache
- Due to Your Webmaster’s Fault
- Google provides “Page Removal”
Advantages to an Attacker:
- “Sort of” Anonymous Attack Surface Mapping
- Page Rank Orders “Less Public” Results Last
History of “Google Hacking”
August 2001
- “Against the System: Rise of the Robots”
- Michal Zalewski (Bindview)
- Phrack #57
November 2001
- “How to use Google to find confidential informations”
- Vincent Gaillot
- BUGTRAQ
GHDB Methodology
Based on Individual Google Search Queries:
- Microsoft Remote Desktop Web Connection
  - intitle:Remote.Desktop.Web.Connection inurl:tsweb
- Virtual Network Computing (VNC)
  - "VNC Desktop" inurl:5800
- Outlook Web Access
  - inurl:"exchange/logon.asp"
  - intitle:"Microsoft Outlook Web Access - Logon"
Issues with GHDB Methodology
Not Targeted:
- Vulnerabilities Outside of site:
- Excludes site: Information Leakage
“We’re Sorry”
- Implemented to Stop Worm Propagation
Resolved in OWASP Testing Guide v3
OWASP Testing Guide v3 Methodology
Based on Google Advanced Search Operators:
- site:
- cache:
Google Advanced Search site: Operator
Targeted to a Specific site:
site:www.google.com
- Targeted at www.google.com only
site:google.com
- Targeted at all hostnames and subdomains
  - www.google.com
  - video.google.com, etc
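The difference between the two forms of `site:` can be modelled with a short scope check. This sketch follows the behaviour described on the slide (a bare domain covers every hostname beneath it; a full hostname covers only itself) and is not a statement of how Google matches internally. The function names are hypothetical.

```python
def in_site_scope(hostname, site):
    """True if hostname would fall inside a site:<site> search, per the slide."""
    hostname = hostname.lower().rstrip(".")
    site = site.lower().rstrip(".")
    # Match the site itself or any hostname ending in ".<site>".
    return hostname == site or hostname.endswith("." + site)


def site_query(site, terms=""):
    """Build a search query string restricted to one site."""
    return f"site:{site} {terms}".strip()


if __name__ == "__main__":
    for host in ("www.google.com", "video.google.com"):
        print(host,
              in_site_scope(host, "www.google.com"),
              in_site_scope(host, "google.com"))
    print(site_query("google.com"))
```

Requiring the leading dot in the suffix test keeps an unrelated hostname such as `notgoogle.com` out of scope for `site:google.com`.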
Google Advanced Search cache: Operator
Display Indexed Web Page in Google Cache
- For Example Click “Cached”
- For Example cache:www.owasp.org
Urgent Page Removal
1. Sign into “Google Webmaster Tools”
2. Click the URL
3. Click “Tools”
4. Click “Remove URLs”
OWASP “Google Hacking” Project
TCP Input Text
TCP Input Text
cmlh$ ./tit.pl
"TCP Input Text" Proof of Concept (PoC) 0.1
Copyright 2008 Christian Heinrich
Licensed under the Apache License, Version 2.0
1. glcfapp.umiacs.umd.edu TCP/8080 available
2. www.speedguide.net TCP/8080 available
3. www.wsu.edu TCP/8080 available
4. inside.c-spanarchives.org TCP/8080 available
5. torrents.freebsd.org TCP/8080 available
6. sammelpunkt.philo.at TCP/8080 available
7. arc.cs.odu.edu TCP/8080 available
8. www.ripn.net TCP/8080 available
9. phy043.tours.inra.fr TCP/8080 available
10. 202.188.95.52 TCP/8080 available
Writing tit.csv
Writing tit_nc.sh
Writing tit_nmap.sh
cmlh$
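doGoogleSearchResponse URL
# $URL (assumed name) holds one result URL from doGoogleSearchResponse
$URL =~ m|(\w+)://([^/:]+)(:\d+)?/(.*)|;
my $Protocol = $1;
my $Domain_Name = $2;
my $URI = ("/" . $4);
# $3 is undefined when the URL has no explicit port, so default to TCP/80
my $TCP_Port = 80;
if (defined $3 && $3 =~ /:(\d+)/) { $TCP_Port = $1 }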
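The same decomposition can be done without a hand-written regular expression. The Python sketch below uses `urllib.parse.urlsplit` from the standard library. The function name and the `/index.html` path are hypothetical, and it deliberately differs from the slide by defaulting `https` URLs to port 443 rather than 80.

```python
from urllib.parse import urlsplit

# Ports assumed when the URL carries no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_result_url(url):
    """Return (protocol, hostname, tcp_port, uri) for one search-result URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
    uri = parts.path or "/"
    if parts.query:
        uri += "?" + parts.query
    return parts.scheme, parts.hostname, port, uri


if __name__ == "__main__":
    # The path below is made up; the hostname and port appear in the slides.
    print(split_result_url("http://www.wsu.edu:8080/index.html"))
    print(split_result_url("http://www.owasp.org/"))
```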
tit.csv
glcfapp.umiacs.umd.edu,8080
www.speedguide.net,8080
inside.c-spanarchives.org,8080
www.wsu.edu,8080
torrents.freebsd.org,8080
sammelpunkt.philo.at,8080
arc.cs.odu.edu,8080
www.ripn.net,8080
phy043.tours.inra.fr,8080
202.188.95.52,8080
tit_nc.sh
nc -vz glcfapp.umiacs.umd.edu 8080
nc -vz www.speedguide.net 8080
nc -vz inside.c-spanarchives.org 8080
nc -vz www.wsu.edu 8080
nc -vz torrents.freebsd.org 8080
nc -vz sammelpunkt.philo.at 8080
nc -vz arc.cs.odu.edu 8080
nc -vz www.ripn.net 8080
nc -vz phy043.tours.inra.fr 8080
nc -vz 202.188.95.52 8080
tit_nmap.sh
nmap -PN -sT -p T:8080 glcfapp.[snip]
nmap -PN -sT -p T:8080 www.spee[snip]
nmap -PN -sT -p T:8080 inside.c[snip]
nmap -PN -sT -p T:8080 www.wsu.edu
nmap -PN -sT -p T:8080 torrents[snip]
nmap -PN -sT -p T:8080 sammelpu[snip]
nmap -PN -sT -p T:8080 arc.cs.odu.edu
nmap -PN -sT -p T:8080 www.ripn.net
nmap -PN -sT -p T:8080 phy043.t[snip]
nmap -PN -sT -p T:8080 202.188.95.52
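The three output files share one input: a list of hostname and port pairs. The sketch below shows one way such files could be rendered in Python, copying the `nc` and `nmap` command lines exactly as they appear on the slides. It is not the tit.pl source, and `render_outputs` is a hypothetical name.

```python
import csv
import io
import shlex


def render_outputs(targets):
    """Return (csv_text, nc_script, nmap_script) for (hostname, port) pairs."""
    csv_buffer = io.StringIO()
    writer = csv.writer(csv_buffer, lineterminator="\n")
    nc_lines = []
    nmap_lines = []
    for host, port in targets:
        writer.writerow([host, port])
        safe_host = shlex.quote(host)  # guard against odd characters in hostnames
        nc_lines.append(f"nc -vz {safe_host} {port}")
        nmap_lines.append(f"nmap -PN -sT -p T:{port} {safe_host}")
    return (
        csv_buffer.getvalue(),
        "\n".join(nc_lines) + "\n",
        "\n".join(nmap_lines) + "\n",
    )


if __name__ == "__main__":
    csv_text, nc_script, nmap_script = render_outputs(
        [("www.wsu.edu", 8080), ("202.188.95.52", 8080)]
    )
    print(csv_text, nc_script, nmap_script, sep="\n")
```

Quoting the hostname matters because it originates from search results, which the person running the generated script does not control.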
OWASP “Google Hacking” Project
Download Indexed Cache
Download Indexed Cache
cmlh$ ./dic.pl
"Download Indexed Cache" Proof of Concept (PoC) 0.1
Copyright 2008 Christian Heinrich
Licensed under the Apache License, Version 2.0
1. http://video.google.com/ 34k in Google Cache
2. http://adwords.google.com/ 24k in Google Cache
3. http://www.google.com/ig 72k in Google Cache
4. http://partnerpage.google.com/ 62k in Google Cache
5. http://knol.google.com/ 23k in Google Cache
6. https://www.google.com/ig 72k in Google Cache
7. http://finance.google.com/ 59k in Google Cache
8. http://www.google.com/friendconnect/ 14k in Google Cache
9. http://www.google.com/mobile/ 11k in Google Cache
10. http://www.google.com/analytics/ 14k in Google Cache
cmlh$
Download Indexed Cache
cmlh$ cd google.com/video
cmlh$ head -n 25 cachedPage.html
<meta http-equiv="Content-Type" content="text/html;charset=US-ASCII"> <base href="http://video.google.com/"><div style="margin:-1px -1px 0;padding:0;border:1px solid #999;background:#fff"><div style="<a href="http://www.googl[snip]
Google SOAP Search API
doGetCachedPage
- $key
- $URL
doGetCachedPageResponse
- … xsi:type="ns2:base64">
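Because the response carries the cached page as base64, a client has to decode it before saving the HTML. The Python sketch below shows that step on a heavily trimmed, hypothetical response: the `<return>` element name, the namespace URIs, and the sample payload are all assumptions for the illustration, and a real response would be wrapped in a SOAP envelope.

```python
import base64
import xml.etree.ElementTree as ET

# Assumed namespace URI for the xsi: prefix used in the sample below.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def extract_cached_page(response_xml):
    """Return the decoded bytes of the first base64-typed element, or None."""
    root = ET.fromstring(response_xml)
    for element in root.iter():
        if element.get(XSI_TYPE, "").endswith(":base64"):
            return base64.b64decode(element.text or "")
    return None


# Hypothetical response body built for the example.
SAMPLE_RESPONSE = (
    "<doGetCachedPageResponse "
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:ns2="http://schemas.xmlsoap.org/soap/encoding/">'
    '<return xsi:type="ns2:base64">'
    + base64.b64encode(b"<html>cached</html>").decode("ascii")
    + "</return></doGetCachedPageResponse>"
)

if __name__ == "__main__":
    print(extract_cached_page(SAMPLE_RESPONSE))
```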
tit.pl and dic.pl Roadmap
OWASP Alpha Project Review
Current Reviewers:
- pdp (@GNUCITIZEN)
- Chris Gates (@Carnal0wnage)
- Glenn Roberts (@Solutionary)
Please contact me if you would like to assist with the review.
Public Release
At RUXCON 2K8 (Late November 2008)
Check In at code.google.com after RUXCON 2K8
Thanks Brian Bourne and Nanna Ng