Table of Contents
Here, in the above image, you can see that recon on tesla.com starts with finding ASN records, acquisitions, linked discovery, and then reverse WHOIS.
After that we find the root domains of tesla.com; we found teslamotors.com, solarcity.com, and others, as shown in the image.
Next, as the image shows, we enumerate subdomains for the target and then move on to CMS enumeration, interesting endpoints, service identification, stack identification, content discovery, and more.
Wide Recon is the art of discovering as many assets related to a target as possible. Make sure your scope permits testing these sites.
Finding Seed/Root Domains
Scope Domains (Bugcrowd)
A bug bounty program’s scope tells us which domains are in scope and which are out of scope.
If you find valid bugs on these domains or subdomains, you will be rewarded with a bounty.
Out of Scope Domains
Don’t touch these domains: even if you find a valid bug on them, the team will reject your report.
We want to continue to gather seed/root domains. Acquisitions are often a new way to expand our available assets if they are in scope. We can investigate a company’s acquisitions on sites like https://crunchbase.com, Wikipedia, and Google.
Here we can possibly drill down into old domains related to Revlo, ClipMine, Curse, and GoodGame. Remember to do some Googling on these acquisitions to see if they are still owned by the parent company. Many times acquisitions will split back out or get sold to another company.
ASN Enumeration (bgp.he.net)
Autonomous System Numbers (ASNs) are given to large enough networks. These ASNs will help us track down some semblance of an entity’s IT infrastructure. The most reliable way to get these is manually through Hurricane Electric’s free-form search:
Because of the advent of cloud infrastructure, ASNs aren’t always a complete picture of a network. Rogue assets could exist on cloud environments like AWS and Azure. Here we can see several IP ranges.
ASN Enumeration (cmd line)
Some automation is available to get ASNs. One such tool is the ‘net’ switch of “Metabigor” by j3ssiejjj, which will fetch ASN data for a keyword from bgp.he.net and asnlookup.com.
Another is “ASNLookup” by Yassine Aboukir which utilizes the maxmind.com dataset.
One problem with command-line enumeration is that you could accidentally return records from another org that happen to contain the keyword ‘tesla’.
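One way to reduce such false positives is to match the keyword against the organization name as a whole word instead of a substring. This is a minimal sketch, assuming the ASN results have already been collected as (ASN, organization) pairs; the function name and the sample records are made up for illustration and are not real lookup output.

```python
import re

def filter_asn_records(records, keyword):
    """Keep records whose organization name contains `keyword` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [(asn, org) for asn, org in records if pattern.search(org)]

# Hypothetical sample data using documentation-reserved AS numbers.
sample = [
    ("AS64500", "Tesla, Inc."),
    ("AS64501", "Teslatel Networks"),
    ("AS64502", "Nikola Tesla Airport"),
]
matches = filter_asn_records(sample, "tesla")
for asn, org in matches:
    print(asn, org)
```

Whole-word matching drops the "Teslatel" record but still keeps the unrelated airport, so the remaining hits need a manual review before they are treated as in scope.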
ASN Enumeration (with Amass)
To discover more seed domains we want to scan the whole ASN with a port scanner and return any root domains we see in SSL certificates, etc. We can do this with Amass intel.
Amass is written by Jeff Foley and the Amass team.
Reverse WHOIS (with Whoxy.com)
Every website has some registration info on file with the registrars. Two key pieces of data we can use are the organization name and any emails in the WHOIS data. To do this you need access to a large WHOIS database. WHOXY.com is one such database. You can use whoxy.com in this fashion, after you register and get your free API key:
● http://api.whoxy.com/?key=APIkeyHERE&reverse=whois&name=Twitch+Hostmaster
Be careful with reverse WHOIS data, as it is the lowest-fidelity source of new root/seed domains. It might include many parked domains or redirects to out-of-scope assets.
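The query URL shown above can be built programmatically so that names containing spaces are encoded correctly. This is a small sketch that only constructs the URL; the function name is an assumption, and the parameter names are taken from the example URL in the text.

```python
from urllib.parse import urlencode

WHOXY_ENDPOINT = "http://api.whoxy.com/"

def reverse_whois_url(api_key, name):
    """Build a reverse WHOIS query URL for a registrant name."""
    params = {"key": api_key, "reverse": "whois", "name": name}
    # urlencode encodes spaces as '+', matching the example in the text.
    return WHOXY_ENDPOINT + "?" + urlencode(params)

url = reverse_whois_url("APIkeyHERE", "Twitch Hostmaster")
print(url)
```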
Reverse WHOIS (with DOMLink)
DOMLink is a tool written by Vincent Yiu (@vysecurity) which will recursively query the WHOXY WHOIS API. It will start by querying our target’s WHOIS record, then analyze all the data and look for other records which contain the organization name or are registered to emails in the record. It does this recursively until it finds no more matching records.
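The recursive pivot described above can be sketched as a breadth-first search. This is not DOMLink's actual code: `lookup_record` and `reverse_lookup` are hypothetical stand-ins for real WHOIS API calls, and the sample data is invented so the example runs offline.

```python
from collections import deque

def recursive_pivot(start_domain, lookup_record, reverse_lookup):
    """Expand a domain set via shared WHOIS organization names and emails.

    lookup_record(domain) -> {"org": str or None, "emails": [str, ...]}
    reverse_lookup(term)  -> iterable of domains registered with that term
    """
    seen_domains = {start_domain}
    seen_terms = set()
    queue = deque([start_domain])
    while queue:
        domain = queue.popleft()
        record = lookup_record(domain)
        terms = [record.get("org")] + list(record.get("emails", []))
        for term in terms:
            if not term or term in seen_terms:
                continue  # skip empty values and terms already pivoted on
            seen_terms.add(term)
            for found in reverse_lookup(term):
                if found not in seen_domains:
                    seen_domains.add(found)
                    queue.append(found)
    return seen_domains

# Invented in-memory data standing in for API responses.
records = {
    "example.com": {"org": "Example Corp", "emails": ["hostmaster@example.com"]},
    "example.net": {"org": "Example Corp", "emails": ["dns@example.net"]},
    "example.org": {"org": None, "emails": ["dns@example.net"]},
}
by_term = {
    "Example Corp": ["example.com", "example.net"],
    "hostmaster@example.com": ["example.com"],
    "dns@example.net": ["example.net", "example.org"],
}
found = recursive_pivot(
    "example.com",
    lambda d: records.get(d, {"org": None, "emails": []}),
    lambda t: by_term.get(t, []),
)
print(sorted(found))
```

Tracking the terms already queried keeps the search from looping and limits the number of API calls.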
Ad/Analytics Relationships (builtwith.com)
You can also glean related domains and subdomains by looking at a target’s ad/analytics tracker codes. Many sites use the same codes across all their domains. Google Analytics and New Relic codes are the most common. We can look at these “relationships” via a site called BuiltWith. BuiltWith also has a Chrome and Firefox extension to do this on the fly.
BuiltWith is also a tool we’ll use to profile the technology stack of a target in later slides.
You can Google the following text from a main target to glean related hosts:
● Copyright text
● Terms of service text
You can see the full Google dorking series here.
Shodan is a tool that continuously spiders infrastructure on the internet. It is much more verbose than regular spiders. It captures response data, cert data, stack profiling data, and more. It requires registration.
This raises a valuable question: is twitch.amazon.eu relevant to our testing?
You can see the full Shodan dorking series here.
Linked and JS Discovery
Linked Discovery (with Burp Suite Pro)
Another way to widen our scope is to examine all the links of our main target. We can do this using Burp Suite Pro. We can visit a seed/root and recursively spider all the links for a term with regex, examining those links… and their links, and so on… until we have found all sites that could be in our scope.
1. Turn off passive scanning
2. Set forms to auto-submit (if you’re feeling frisky)
3. Set scope to advanced control and use “keyword” of target name (not a normal FQDN)
4. Walk+browse main site, then spider all hosts recursively!
Burp after requesting one site:
After the first spider run we’ve now discovered a ton of linked URLs that belong to our project. Not only subdomains, but NEW seeds/roots (twitchapp.net, ext-twitch.tv, twitchsvc.net).
We can also now spider these new hosts and repeat until we have Burp Spider fatigue.
Now that we have this data, how do we export it?
1) Select all hosts in the site tree
2) Right-click the selected hosts (Pro only)
3) Go to “Engagement Tools” -> “Analyze target”
4) Save report as an html file
5) Copy the hosts from the “Target” section
Linked Discovery (with GoSpider or hakrawler)
Linked discovery really just counts on using a spider recursively. One of the most extensible spiders for general automation is GoSpider, written by j3ssiejjj, which can be used for many things and supports parsing JS very well. In addition, hakrawler by hakluke has many parsing strategies that interest bug hunters.
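The core step of linked discovery, pulling hosts out of a page and keeping those that match a target keyword, can be sketched with the standard library alone. This is an illustration of the idea, not how Burp, GoSpider, or hakrawler work internally; the class and function names and the sample HTML are assumptions made for the example.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect href and src attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)

def in_scope_hosts(html, base_url, keyword):
    """Return hostnames linked from `html` that contain `keyword`."""
    parser = LinkCollector()
    parser.feed(html)
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hosts = set()
    for link in parser.links:
        # Resolve relative and protocol-relative links against the page URL.
        host = urlparse(urljoin(base_url, link)).hostname
        if host and pattern.search(host):
            hosts.add(host)
    return hosts

# Made-up page content for demonstration.
page = """
<a href="/directory">Browse</a>
<a href="https://dev.twitch.tv/docs">Docs</a>
<script src="//static.twitchsvc.net/app.js"></script>
<a href="https://www.example.com/">Elsewhere</a>
"""
hosts = in_scope_hosts(page, "https://www.twitch.tv/", "twitch")
print(sorted(hosts))
```

Feeding each newly found host back in as a new `base_url` gives the recursive behaviour; matching on a keyword rather than a fixed domain is what surfaces new seeds/roots.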
Subdomain Enumeration (with SubDomainizer)
SubDomainizer analyzes JavaScript for three purposes:
1. Find subdomains referenced in JS files
2. Find cloud services referenced in JS files
3. Use the Shannon entropy formula to find potentially sensitive items in JS files
It will take a single page and scan for JS files to analyze. If you are just looking for subdomains, subscraper by Cillian-Collins might be better because it has recursion.
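To show what the Shannon entropy check means in practice, here is a minimal sketch that scores long tokens in a JS snippet and keeps the ones that look random. It is not SubDomainizer's implementation; the token pattern, the 20-character minimum, the 4.0 bits-per-character threshold, and the sample key are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Runs of 20+ characters from a base64/URL-safe style alphabet.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_-]{20,}")

def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def high_entropy_strings(js_source, threshold=4.0):
    """Return long tokens whose entropy exceeds `threshold`."""
    return [t for t in TOKEN_RE.findall(js_source) if shannon_entropy(t) > threshold]

# Made-up snippet: one identifier, one random-looking key, one padding string.
sample_js = (
    'var el = document.getElementsByClassName("x");\n'
    'var key = "Zx8Qm2LpV7rT4cNb9WsK1dYh";\n'
    'var pad = "aaaaaaaaaaaaaaaaaaaaaaaa";\n'
)
suspects = high_entropy_strings(sample_js)
print(suspects)
```

Ordinary identifiers and repeated characters reuse the same few symbols and score low, while keys and tokens spread across many symbols and score high. The threshold needs tuning per target to balance noise against misses.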
Subdomain Scraping Sources
The next set of tools scrape domain information from all sorts of projects that expose databases of URLs or domains. New sources are coming out all the time so the tools must evolve constantly. This is only a small list of sources. Many more exist.
There are a number of sources from which you can get subdomains, such as Google, VirusTotal, Bing, Facebook, the Wayback Machine, and many more.
Subdomain Scraping Example (Google)
1. site:twitch.tv -www.twitch.tv
2. site:twitch.tv -www.twitch.tv -watch.twitch.tv
3. site:twitch.tv -www.twitch.tv -watch.twitch.tv -dev.twitch.tv
See the full Google dorking series for more examples.
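The pattern in the three queries above is to exclude every subdomain already found so that the next search surfaces new ones. A tiny helper, with a hypothetical name, can generate the next query:

```python
def next_dork(root_domain, known_subdomains):
    """Build a site: query that excludes subdomains already discovered."""
    exclusions = " ".join("-" + sub for sub in known_subdomains)
    return ("site:" + root_domain + " " + exclusions).strip()

query = next_dork("twitch.tv", ["www.twitch.tv", "watch.twitch.tv", "dev.twitch.tv"])
print(query)
```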
Subdomain Scraping (Amass)
For scraping subdomain data there are two industry-leading tools at the moment: Amass and Subfinder. They parse all the “sources” referenced in the previous slide, and more. Amass has the most sources, extensible output, bruteforcing, permutation scanning, and a ton of other modes to do additional analysis of attack surfaces.
Subdomain Scraping (Subfinder v2)
Subfinder is another best-in-breed tool, originally written by ice3man and Michael Skelton. It is now maintained by a larger group called projectdiscovery.io. It incorporates multiple sources, has extensible output, and more.
Subdomain Scraping Example (Sudomy)
Subdomain Scraping (shosubgo)
Another valuable bespoke scraping technique is gathering subdomains from Shodan. Shosubgo is a Go script written by inc0gbyt3 which is effective and reliable for this method.
Subdomain Scraping (Cloud Ranges)
A highly valuable technique is to monitor whole cloud ranges of AWS, GCP, and Azure for SSL sites, and parse their certificates to match your target. Doing this is cumbersome on your own but possible with something like masscan. Daehee Park outlines it here.
Luckily Sam Erb did a wonderful DEF CON talk on this and created a service which scans every week. Some bash scripting required 😉
At this point we move into guessing for live subdomains. If we try to resolve thistotallydoesntexist.company.com we will *usually* not get a record.
So we can use a large list of common subdomain names and just try to resolve them, checking whether they succeed. The problem with this method is that using only one DNS server will take forever. Some tools have come out that are both threaded and use multiple DNS resolvers simultaneously. This speeds up the process significantly. Massdns by @blechschmidt pioneered this idea. Amass (8 resolvers by default) does this with the -rf flag.
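The brute-force idea can be sketched as follows: resolve a name that should not exist to detect wildcard DNS, then resolve wordlist candidates in parallel. This sketch uses the system resolver through a thread pool rather than multiple DNS resolvers, so it shows the logic and not the speed of massdns or Amass; the function names are assumptions, and the demo swaps in a fake resolver so it runs without network access.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def system_resolve(hostname):
    """Return an IP for `hostname`, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None

def brute_subdomains(domain, words, resolve=system_resolve, workers=20):
    """Resolve word.domain candidates and return {name: ip} for live ones."""
    # If a nonsense name resolves, the zone likely has a wildcard record.
    wildcard_ip = resolve("thistotallydoesntexist." + domain)
    candidates = [word + "." + domain for word in words]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(resolve, candidates))
    return {
        name: ip
        for name, ip in zip(candidates, results)
        if ip is not None and ip != wildcard_ip
    }

# Offline demo with a fake resolver (a dict lookup returning None on a miss).
fake_dns = {"www.example.com": "192.0.2.10", "api.example.com": "192.0.2.11"}
found = brute_subdomains("example.com", ["www", "api", "nope"], resolve=fake_dns.get)
print(found)
```

Discarding answers that equal the wildcard IP is a simplification; zones that return several different wildcard addresses need a more careful check.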
Subdomain Bruting (Amass)
Amass offers bruteforcing via the “enum” tool using the “brute” switch.
● amass enum -brute -d twitch.tv -src
It has a built-in list, but you can specify your own lists. You can also specify any number of resolvers:
● amass enum -brute -d twitch.tv -rf resolvers.txt -w bruteforce.list
Doing this with Amass also gives us the opportunity to resolve the found domains. I haven’t checked out aiodnsbrute yet, but I’ve heard it’s fast.
Subdomain Bruting (shuffleDNS)
Subdomain Bruting Lists
A multi-resolver, threaded subdomain bruter is only as good as its wordlist.
There are two trains of thought here:
● Tailored wordlists
● Massive wordlists
Both have advantages.
My all.txt file is still what I use on a regular basis. It combines 7 years of DNS bruteforce lists into one.
New lists for subdomain bruteforcing are all fairly similar nowadays, but the first team to really iterate on this was the AssetNote team with their commonspeak data collection.
The all.txt file includes commonspeak v1 data but there is also a second version of commonspeak data out:
When bruteforcing or gathering subdomains via scraping you may come across a naming pattern in these subdomains. Even though you may not have found it yet, there may be other targets that conform to naming conventions.
In addition, sometimes targets are not explicitly protected across naming conventions. The first tool to attempt to recognize these patterns and bruteforce for some of them was altdns written by Naffy and Shubs.
Now Amass contains logic to check for these “permutations” and includes this analysis in a default run. Some personal experience is cited on the next page.
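To make the permutation idea concrete, the sketch below generates candidate names from one known subdomain label and a short wordlist. It is a simplified illustration, not the algorithm altdns or Amass actually uses; the function name and the five patterns are assumptions.

```python
def permutations(subdomain, domain, words):
    """Generate candidate hostnames by combining a known label with words."""
    candidates = set()
    for word in words:
        candidates.add(f"{word}-{subdomain}.{domain}")  # dev-api.example.com
        candidates.add(f"{subdomain}-{word}.{domain}")  # api-dev.example.com
        candidates.add(f"{word}.{subdomain}.{domain}")  # dev.api.example.com
        candidates.add(f"{word}{subdomain}.{domain}")   # devapi.example.com
        candidates.add(f"{subdomain}{word}.{domain}")   # apidev.example.com
    return sorted(candidates)

names = permutations("api", "example.com", ["dev", "stage"])
for name in names:
    print(name)
```

The resulting candidates still have to be resolved to find out which ones exist.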
Port Analysis (masscan)
Most hacker education would have you use nmap here, but masscan by Robert Graham is much faster for general “finding-open-ports-on-TCP”. Chaining masscan’s output to then be nmap’ed can save a lot of time.
Masscan achieves this speed with a rewritten TCP/IP stack and true multi-threading, and it is written in C. Sample syntax for scanning a list of IPs:
● masscan -p1-65535 -iL $ipFile --max-rate 1800 -oG $outPutFile.log
A full syntax guide of masscan (authored by Daniel Miessler) can be found here:
Port Analysis (dnmasscan)
One limitation of masscan is that it only scans IP addresses. You can write your own simple converter script or you can use something like dnmasscan by @rastating.
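A simple converter script of the kind mentioned above might look like this. It is a sketch rather than a reimplementation of dnmasscan: the function names are assumptions, only one IPv4 address per domain is recorded, and the demo uses a fake resolver so it runs offline.

```python
import socket

def domains_to_ips(domains, resolve=socket.gethostbyname):
    """Resolve domains; return ({ip: [domains]}, [unresolved domains])."""
    ip_to_domains = {}
    unresolved = []
    for domain in domains:
        try:
            ip = resolve(domain)
        except OSError:
            unresolved.append(domain)
            continue
        # Remember which domains share an IP so results can be mapped back.
        ip_to_domains.setdefault(ip, []).append(domain)
    return ip_to_domains, unresolved

def write_ip_file(ip_to_domains, path):
    """Write one unique IP per line, suitable as an input list for a scanner."""
    with open(path, "w") as handle:
        for ip in sorted(ip_to_domains):
            handle.write(ip + "\n")

# Offline demo with a fake resolver.
def fake_resolve(domain):
    table = {"a.example.com": "192.0.2.1", "b.example.com": "192.0.2.1"}
    if domain not in table:
        raise OSError("no such host")
    return table[domain]

mapping, missing = domains_to_ips(
    ["a.example.com", "b.example.com", "c.example.com"], resolve=fake_resolve
)
print(mapping, missing)
```

Keeping the IP-to-domain mapping matters because several hostnames often share one address, and the scan output only reports IPs.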
Service Scanning (brutespray)
When we get this service/port information we can feed it to nmap to get a greppable (-oG) output file. We can then scan the remote administration protocols for default passwords with a tool called Brutespray by @x90skysn3k, which takes the nmap -oG file format.
Github Dorking (manual)
Many organizations quickly grow in their engineering teams. Sooner or later a new developer, intern, contractor, or other staff will leak source code online, usually through a public Github.com repo that they mistakenly thought they had set private. Enjoy my quick github dork collection:
The repo mentioned earlier by Gwendal Le Coguic called “github-search” has some automated GitHub tools for this as well. Also check out @th3g3ntelman’s full module on GitHub and Sensitive Data Exposure.
Screenshotting (Eyewitness, Aquatone, httpscreenshot)
At this point we have a lot of attack surface. We can feed possible domains to a tool and attempt to screenshot the results. This will allow us to “eyeball” things that might be interesting. There are many tools for this, including Aquatone (a wider recon framework that also does screenshotting), HTTPscreenshot, and Eyewitness. I use Eyewitness because it will prepend both the http and https protocol to each domain we have observed. I’m not highly tied to this tool though; find one that works for you.
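If your screenshotting tool of choice does not prepend both protocols itself, the same effect is easy to get when preparing its input list. A minimal sketch, with a hypothetical function name:

```python
def candidate_urls(domains):
    """Return an http:// and https:// URL for every non-empty domain."""
    urls = []
    for domain in domains:
        domain = domain.strip()
        if not domain:
            continue  # ignore blank lines from the input list
        urls.append("http://" + domain)
        urls.append("https://" + domain)
    return urls

urls = candidate_urls(["www.example.com", "", " dev.example.com "])
for url in urls:
    print(url)
```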
Subdomain takeover (can i take over xyz)
“Subdomain takeover vulnerabilities occur when a subdomain (subdomain.example.com) is pointing to a service (e.g. GitHub pages, Heroku, etc.) that has been removed or deleted. This allows an attacker to set up a page on the service that was being used and point their page to that subdomain. For example, if subdomain.example.com was pointing to a GitHub page and the user decided to delete their GitHub page, an attacker can now create a GitHub page, add a CNAME file containing subdomain.example.com, and claim subdomain.example.com.”
A great resource for subdomain takeover is EdOverflow’s repo can-i-take-over-xyz.
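The check behind most takeover tools is a fingerprint match: the subdomain's CNAME points at a third-party service, and the response body contains that service's "nothing here" message. The sketch below shows the matching step only. The function name is an assumption, and the two fingerprints are illustrative and should be verified against can-i-take-over-xyz before being relied on.

```python
# Illustrative fingerprints: CNAME suffix -> text seen on an unclaimed page.
FINGERPRINTS = {
    "github.io": "There isn't a GitHub Pages site here.",
    "herokuapp.com": "No such app",
}

def possible_takeover(cname_target, response_body):
    """Return the matching service suffix if both CNAME and body match."""
    target = cname_target.rstrip(".").lower()
    for suffix, marker in FINGERPRINTS.items():
        points_at_service = target == suffix or target.endswith("." + suffix)
        if points_at_service and marker in response_body:
            return suffix
    return None

hit = possible_takeover("myorg.github.io.", "<p>There isn't a GitHub Pages site here.</p>")
miss = possible_takeover("myorg.github.io.", "<h1>Welcome</h1>")
print(hit, miss)
```

A match is only a lead; confirming it means checking whether the resource can actually be claimed on the service, within the program's rules.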
Subdomain takeover (SubOver & nuclei)
To find subdomain takeovers we can use a few tools. SubOver is a discontinued standalone tool by Ice3man that has since been incorporated into Project Discovery’s nuclei scanner. Nuclei is part of a larger scanning framework but boasts the most takeover checks of any tool I’ve seen.
Extending tools (interlace)
Eventually you will want to make a script or recon framework of your own. Quickly you will come up against some problems:
● Not all tools extend to take different sources of input
● Some lack threading.
● Not all can be distributed
You can rewrite a tool yourself to handle these issues, but some help does exist here. Interlace by Michael Skelton aka Codingo is an awesome tool that helps glue together a recon framework.
Interlace can take these tools and add support for: CIDR input, Glob input, threading, proxying, queued commands, and more. Hakluke wrote a great guide on it here.
Extending tools (anything TomNomNom writes)
TomNomNom has an extensive repo of tools, all of which are awesome. I highly suggest you check them all out.
Recon Frameworks
It could be recon is not really your thing. That’s all right.
Several hunters have open sourced their automation at this point and you can choose one that fits you and use it without worrying too much. I usually classify recon frameworks in rough tiers:
C-Tier: automation built around scripting up other tools in bash or python. Step based, no workflow. Few techniques. Little extensibility.
B-Tier: automation writing a few of their own modules. Some GUI or advanced workflow. Medium techniques. Runs point-in-time. Flat files.
A-Tier: automation writing all their own modules. Has GUI. Runs iteratively. Manages data via db.
S-Tier: automation writing their own modules. Has GUI. Runs iteratively. Manages data via db. Scales across multiple boxes. Sends alerts to user. Uses novel techniques and iterates quickly. ML + AI.