The bug hunter’s methodology v4 recon (TBHMV4)


This methodology was written by Jhaddix, and the credit goes to him. You can follow him on Twitter.

You need to take notes. There are some good note-taking tools, such as Notepad++ and Vim.

 

Project Tracking

Project Tracking

Here, in the above image, you can see that you first do recon on tesla.com: finding ASN records, acquisitions, linked discovery, and then reverse WHOIS.

After that we find the root domains of tesla.com. We found teslamotors.com, solarcity.com, and others, as shown in the image.

 

In the above image we are enumerating subdomains for a target, and then enumerating the CMS, interesting endpoints, service identification, stack identification, content discovery, and more.

Mission

Wide Recon is the art of discovering as many assets related to a target as possible. Make sure your scope permits testing these sites.

Finding Seed/Root Domains

Scope Domains (Bugcrowd)

Start by learning which domains are in scope and which are out of scope for the bug bounty program.

In-Scope Domains
In-scope domains

If you find valid bugs on these domains or their subdomains, you will be rewarded with a bounty.

Out of Scope Domains
Out of scope domains

Don’t touch these domains, because even if you find a valid bug your report will be rejected by the team.
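
To make the in-scope/out-of-scope split concrete, here is a minimal sketch of a scope check that could sit in front of any tooling. The patterns and the example.com hosts are hypothetical placeholders, not taken from any real program; copy the real lists from the program brief.

```python
from fnmatch import fnmatch

# Hypothetical scope lists; replace them with the program's own.
IN_SCOPE = ["example.com", "*.example.com"]
OUT_OF_SCOPE = ["blog.example.com", "*.corp.example.com"]


def is_in_scope(host):
    """Return True only if host matches an in-scope pattern and no out-of-scope one."""
    host = host.lower().rstrip(".")
    # Out-of-scope rules win, so check them first.
    if any(fnmatch(host, pattern) for pattern in OUT_OF_SCOPE):
        return False
    return any(fnmatch(host, pattern) for pattern in IN_SCOPE)
```

Checking out-of-scope patterns first means an exclusion always overrides a broad wildcard, which is usually how program briefs are meant to be read.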

Acquisitions (Crunchbase)

Crunchbase

We want to continue to gather seed/root domains. Acquisitions are often a new way to expand our available assets if they are in scope. We can investigate a company’s acquisitions on sites like https://crunchbase.com, Wikipedia, and Google.

Here we can possibly drill down into old domains related to Revlo, ClipMine, Curse, and GoodGame. Remember to do some Googling on these acquisitions to see if they are still owned by the parent company. Many times acquisitions will split back out or get sold to another company.

ASN Enumeration (bgp.he.net)

asn enumeration

Autonomous System Numbers (ASNs) are given to large enough networks. These ASNs will help us track down some semblance of an entity’s IT infrastructure. The most reliable way to get these is manually through Hurricane Electric’s free-form search:

http://bgp.he.net

Because of the advent of cloud infrastructure, ASNs aren’t always a complete picture of a network. Rogue assets could exist on cloud environments like AWS and Azure. Here we can see several IP ranges.

ASN Enumeration (cmd line)

Some automation is available to get ASNs. One such tool is the ‘net’ switch of “Metabigor” by j3ssiejjj, which fetches ASN data for a keyword from bgp.he.net and asnlookup.com.

cmd line

Another is “ASNLookup” by Yassine Aboukir, which utilizes the maxmind.com dataset. One problem with cmd line enumeration is that you could return records from another org by accident if its name contains the keyword ‘tesla’.

cmd line

ASN Enumeration (with Amass)

To discover more seed domains we want to scan the whole ASN with a port scanner and return any root domains we see in SSL certificates, etc. We can do this with Amass intel. Amass is written by Jeff Foley and the Amass team.

amass

Reverse WHOIS (with Whoxy.com)

Every website has some registration info on file with the registrars. Two key pieces of data we can use are the organization name and any emails in the WHOIS data. To do this you need access to a large WHOIS database. WHOXY.com is one such database. You can use whoxy.com in this fashion, after you register and get your free API key:

http://api.whoxy.com/?key=APIkeyHERE&reverse=whois&name=Twitch+Hostmaster

Be careful with reverse WHOIS data, as it is the lowest-fidelity source of new root/seed domains. It might include many parked domains or redirects to out-of-scope assets.
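
As a small illustration of the request above, this sketch builds the same reverse WHOIS URL from its parts. It only uses the endpoint and the three query parameters shown in the example (key, reverse, name); anything else about the API, including its response format, is not assumed here.

```python
from urllib.parse import urlencode

# Endpoint exactly as it appears in the example URL above.
WHOXY_ENDPOINT = "http://api.whoxy.com/"


def reverse_whois_url(api_key, name):
    """Build a reverse WHOIS query URL for a registrant or organization name."""
    params = {"key": api_key, "reverse": "whois", "name": name}
    # urlencode escapes spaces as '+', matching the example query string.
    return WHOXY_ENDPOINT + "?" + urlencode(params)
```

Building the URL with urlencode avoids stray spaces or unescaped characters when the organization name contains punctuation.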

whoxy.com

Reverse WHOIS (with DOMLink)

DOMLink is a tool written by Vincent Yiu (@vysecurity) which will recursively query the WHOXY WHOIS API. It starts by querying our target’s WHOIS record, then analyzes all the data and looks for other records which contain the organization name or are registered to emails in the record. It does this recursively until it finds no more matching records.

Ad/Analytics Relationships (builtwith.com)

You can also glean related domains and subdomains by looking at a target’s ad/analytics tracker codes. Many sites use the same codes across all their domains. Google Analytics and New Relic codes are the most common. We can look at these “relationships” via a site called BuiltWith. BuiltWith also has Chrome and Firefox extensions to do this on the fly.

https://builtwith.com/relationships/twitch.tv

BuiltWith is also a tool we’ll use to profile the technology stack of a target in later slides
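
The tracker-code idea can be sketched locally as well. The example below is a rough illustration, not how BuiltWith works: it assumes you already have the HTML of each candidate site in a dict, and it only looks for classic Google Analytics IDs of the form UA-<account>-<property>. The host names in any input are whatever you supply.

```python
import re

# Classic Google Analytics property IDs, e.g. UA-12345-1 (digit counts are a loose assumption).
GA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")


def tracker_ids(html):
    """Return the set of Google Analytics IDs found in a page's HTML."""
    return set(GA_ID.findall(html))


def shared_trackers(pages):
    """Map each tracker ID to the hosts that share it (IDs seen on 2+ hosts only).

    pages is a dict of host -> HTML source that was fetched beforehand.
    """
    seen = {}
    for host, html in pages.items():
        for tracker in tracker_ids(html):
            seen.setdefault(tracker, set()).add(host)
    return {tracker: hosts for tracker, hosts in seen.items() if len(hosts) > 1}
```

Two hosts that share an ID are only a lead; confirm ownership before treating the second host as part of the target.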

builtwith

Google-Fu

You can Google the following text from a main target to glean related hosts:

Copyright text
Terms of service text
Privacy policy text

google dorking

You can see the full Google dorking series here.

Shodan

Shodan is a tool that continuously spiders infrastructure on the internet. It is much more verbose than regular spiders. It captures response data, cert data, stack profiling data, and more. It requires registration.

Example:
https://www.shodan.io/search?query=twitch.tv

We can glean a valuable question: is twitch.amazon.eu relevant to our testing?

You can see full shodan dorking series here.

Finding Subdomains

Subdomain enumeration

Linked and JS Discovery

Linked Discovery (with Burp Suite Pro)

Another way to widen our scope is to examine all the links of our main target. We can do this using Burp Suite Pro. We can visit a seed/root and recursively spider all the links for a term with regex, examining those links… and their links, and so on… until we have found all sites that could be in our scope.

1. Turn off passive scanning
2. Set forms to auto-submit (if you’re feeling frisky)
3. Set scope to advanced control and use “keyword” of target name (not a normal FQDN)
4. Walk+browse main site, then spider all hosts recursively!
5. Profit

Burp after requesting one site:

Bug hunter

After the 1st spider run we’ve now discovered a ton of linked URLs that belong to our project. Not only subdomains, but NEW seeds/roots (twtchapp.net, ext-twitch.tv, twitchsvc.net).
We can also now spider these new hosts and repeat until we have Burp Spider fatigue.

Bug hunter

Now that we have this data, how do we export it?

1) Select all hosts in the site tree

2) In PRO ONLY right click the selected hosts

3) Go to “Engagement Tools” -> “Analyze target”

4) Save report as an html file

5) Copy the hosts from the “Target” section

Linked Discovery (with GoSpider or hakrawler)

Linked discovery really just counts on using a spider recursively. One of the most extensible spiders for general automation is GoSpider written by j3ssiejjj which can be used for many things and supports parsing js very well. In addition hakrawler by hakluke has many parsing strategies that interest bug hunters.
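
To show what one step of linked discovery looks like, here is a minimal sketch using only the Python standard library. It is not how GoSpider or hakrawler are implemented. It assumes the page HTML has already been fetched, and it keeps any linked host whose name contains a keyword, in the same spirit as the Burp “keyword” scope described earlier.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href and src attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def linked_hosts(base_url, html, keyword):
    """Return hosts linked from the page whose name contains the keyword."""
    parser = LinkCollector()
    parser.feed(html)
    hosts = set()
    for link in parser.links:
        # urljoin resolves relative and protocol-relative links against the page URL.
        host = urlparse(urljoin(base_url, link)).hostname
        if host and keyword in host:
            hosts.add(host)
    return hosts
```

Feeding each newly found host back in as a new base_url gives the recursive behaviour; a real spider also needs rate limiting and a visited set.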

Gospider and hakrawler

Subdomain Enumeration (with SubDomainizer)

SubDomainizer by Neeraj Edwards is a tool with three purposes in analyzing JavaScript. It will take a page and:

1. Find subdomains referenced in js files
2. Find cloud services referenced in js files
3. Use the Shannon Entropy formula to find potentially sensitive items in js files

It will take a single page and scan for js files to analyze. If just looking for subdomains subscraper by Cillian-Collins might be better because it has recursion.
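
The third purpose relies on Shannon entropy, which is easy to sketch. The example below is an illustration of the formula rather than SubDomainizer’s own code; the token pattern and the threshold of 4.0 bits per character are assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Long runs of key-like characters; the pattern and length are assumptions.
TOKEN = re.compile(r"[A-Za-z0-9+/=_-]{20,}")


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(source, threshold=4.0):
    """Return long tokens in source whose entropy meets the threshold."""
    return [t for t in TOKEN.findall(source) if shannon_entropy(t) >= threshold]
```

Random-looking strings such as API keys score high, while repetitive or word-like strings score low, so the threshold trades missed secrets against noise.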

SubDomainizer

Subdomain Scraping Sources

The next set of tools scrape domain information from all sorts of projects that expose databases of URLs or domains. New sources are coming out all the time so the tools must evolve constantly. This is only a small list of sources. Many more exist.

Sources of subdomain

There are a number of sources from which you can get subdomains, such as Google, VirusTotal, Bing, Facebook, the Wayback Machine, and many more.

Subdomain Scraping Example (Google)

1. site:twitch.tv -www.twitch.tv
2. site:twitch.tv -www.twitch.tv -watch.twitch.tv
3. site:twitch.tv -www.twitch.tv -watch.twitch.tv -dev.twitch.tv
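
The three queries above follow a simple pattern: each round excludes every subdomain found so far. A small helper, sketched here under that assumption, builds the next query from the list of known subdomains.

```python
def next_dork(root, known_subdomains):
    """Build a Google query for root that excludes already-known subdomains."""
    excluded = " ".join(f"-{sub}" for sub in known_subdomains)
    return f"site:{root} {excluded}".strip()
```

For example, passing twitch.tv with www.twitch.tv and watch.twitch.tv yields the second query in the list.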

You can see the full Google dorking series here.

google

Subdomain Scraping (Amass)

For scraping subdomain data there are two industry-leading tools at the moment: Amass and Subfinder. They parse all the “sources” referenced in the previous slide, and more. Amass has the most sources, extensible output, bruteforcing, permutation scanning, and a ton of other modes to do additional analysis of attack surfaces.

Amass

Subdomain Scraping (Subfinder v2)

Subfinder is another best-in-breed tool, originally written by ice3man and Michael Skelton. It is now maintained by a larger group called projectdiscovery.io. It incorporates multiple sources, has extensible output, and more.

Subdomain Scraping Example (Sudomy)

Subdomain Scraping (shosubgo)

Another valuable bespoke scraping technique is gathering subdomains from Shodan. Shosubgo is a Go script written by inc0gbyt3 which is effective and reliable for this method.

subdomain

Subdomain Scraping (Cloud Ranges)

A highly valuable technique is to monitor whole cloud ranges of AWS, GCP, and Azure for SSL sites, and parse their certificates to match your target. Doing this is cumbersome on your own but possible with something like masscan. Daehee Park outlines it here.

Luckily Sam Erb did a wonderful defcon talk on this and created a service which scans every week. Some bash scripting required 😉

AWS

Subdomain Bruteforce

At this point we move into guessing for live subdomains. If we try and resolve thistotallydoesntexist.company.com we will *usually* not get a record.

So we can use a large list of common subdomain names, try to resolve each one, and see which succeed. The problem with this method is that using only one DNS server will take forever. Some tools have come out that are both threaded and use multiple DNS resolvers simultaneously. This speeds up the process significantly. Massdns by @blechschmidt pioneered this idea.
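
The guessing loop itself is short. The sketch below is a simplified illustration and not how massdns works: it threads lookups but uses only the system resolver, whereas the tools above spread queries across many resolvers. The resolver argument is an assumption added so the lookup function can be swapped out.

```python
import socket
import uuid
from concurrent.futures import ThreadPoolExecutor


def resolve(hostname):
    """Return an IP for hostname via the system resolver, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def has_wildcard(domain, resolver=resolve):
    """A random label that resolves suggests wildcard DNS, which makes guesses unreliable."""
    return resolver(f"{uuid.uuid4().hex}.{domain}") is not None


def brute_subdomains(domain, words, resolver=resolve, workers=20):
    """Try word.domain for each word and return the names that resolved."""
    candidates = [f"{word}.{domain}" for word in words]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        answers = list(pool.map(resolver, candidates))
    return {host: ip for host, ip in zip(candidates, answers) if ip}
```

Running has_wildcard first covers the “usually” in the text: if a made-up name resolves, every guess will appear to succeed.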

Amass (8 resolvers by default) does this with the -rf flag.

Subdomain Bruting (Amass)

Amass offers bruteforcing via the “enum” tool using the “brute” switch.

amass enum -brute -d twitch.tv -src

It has a built-in list but you can specify your own lists. You can also specify any number of resolvers:

amass enum -brute -d twitch.tv -rf resolvers.txt -w bruteforce.list

Doing this with Amass also gives us the opportunity to resolve the found domains. I haven’t checked out aiodnsbrute yet but I’ve heard it’s fast.

Subdomain bruteforcing

Subdomain Bruting (shuffleDNS)

Subdomain Bruting (shuffleDNS)

Subdomain Bruting Lists

A multi-resolver, threaded subdomain bruter is only as good as its wordlist.

There are two trains of thought here:

Tailored wordlists
Massive wordlists

Both have advantages.
My all.txt file is still what I use on a regular basis. It combines 7 years of DNS bruteforce lists into one.

Subdomain Bruting Lists

New lists for subdomain bruteforce are all relatively similar nowadays, but the first team to really iterate on this is the AssetNote team with their commonspeak data collection.

The all.txt file includes commonspeak v1 data but there is also a second version of commonspeak data out:

https://github.com/assetnote/commonspeak2

Alteration Scanning

When bruteforcing or gathering subdomains via scraping you may come across a naming pattern in these subdomains. Even though you may not have found it yet, there may be other targets that conform to naming conventions.

In addition, sometimes targets are not explicitly protected across naming conventions. The first tool to attempt to recognize these patterns and bruteforce for some of them was altdns written by Naffy and Shubs.

Amass now contains logic to check for these “permutations” and includes this analysis in a default run. Some personal experience is cited on the next page.
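
A toy version of alteration generation makes the idea concrete. This is a sketch of the general technique, not the algorithm used by altdns or Amass, and the three patterns it produces (prefix, suffix, and extra label) are only a small assumed subset of what those tools try.

```python
def alterations(subdomain, domain, words):
    """Generate candidate names by combining a known subdomain label with words."""
    suffix = "." + domain
    if not subdomain.endswith(suffix):
        raise ValueError(f"{subdomain} is not under {domain}")
    label = subdomain[: -len(suffix)]
    candidates = set()
    for word in words:
        candidates.add(f"{word}-{label}{suffix}")   # dev-api.example.com
        candidates.add(f"{label}-{word}{suffix}")   # api-dev.example.com
        candidates.add(f"{word}.{label}{suffix}")   # dev.api.example.com
    return candidates
```

The candidates still need to be resolved; most will not exist, so this step is normally paired with a fast resolver.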

Alteration Scanning

Other

Port Analysis (masscan)

Most hacker education would have you use nmap here, but masscan by Robert Graham is much faster for general “finding-open-ports-on-TCP”. Chaining masscan’s output to then be nmap’ed can save a lot of time.

Masscan achieves this speed with a re-written TCP/IP stack, true multi-threading, and is written in C. Sample syntax for scanning a list of IPs:

masscan -p1-65535 -iL $ipFile --max-rate 1800 -oG $outPutFile.log

A full syntax guide of masscan (authored by Daniel Miessler) can be found here:
https://danielmiessler.com/study/masscan/

Port Analysis (dnmasscan)

One limitation of masscan is that it only scans IP addresses. You can write your own simple converter script or you can use something like dnmasscan by @rastating.
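
A “simple converter script” could look like the sketch below. It is an illustration only, not dnmasscan’s implementation: it resolves each domain with the system resolver and groups domains by IP so the scan results can be mapped back to names later. The resolver parameter is an assumption added to make the lookup replaceable.

```python
import socket


def domains_to_ips(domains, resolver=socket.gethostbyname):
    """Resolve domains and return a dict of IP -> list of domains on that IP."""
    mapping = {}
    for domain in domains:
        domain = domain.strip()
        if not domain:
            continue
        try:
            ip = resolver(domain)
        except OSError:
            continue  # skip names that do not resolve
        mapping.setdefault(ip, []).append(domain)
    return mapping
```

Writing the dictionary keys to a file, one per line, gives an input suitable for the -iL option used in the masscan example.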

dnmasscan

Service Scanning (brutespray)

When we get this service/port information we can feed it to nmap to get a greppable (-oG) output file. We can then scan the remote administration protocols for default passwords with a tool called BruteSpray by @x90skysn3k, which takes the nmap -oG file format.

Github Dorking (manual)

Many organizations quickly grow in their engineering teams. Sooner or later a new developer, intern, contractor, or other staff will leak source code online, usually through a public Github.com repo that they mistakenly thought they had set private. Enjoy my quick github dork collection:

https://gist.github.com/jhaddix/1fb7ab2409ab579178d2a799599

The repo mentioned earlier by Gwendal Le Coguic called “github-search” has some automated GitHub tools for this as well. Also check out @th3g3ntelman’s full module on GitHub and Sensitive Data Exposure.

Screenshotting (Eyewitness, Aquatone, httpscreenshot)

At this point we have a lot of attack surface. We can feed possible domains to a tool and attempt to screenshot the results. This will allow us to “eye-ball” things that might be interesting. There are many tools for this: Aquatone (a wider recon framework that also does this), HTTPScreenshot, and EyeWitness. I use EyeWitness because it will prepend both the http and https protocol to each domain we have observed. I’m not highly tied to this tool though; find one that works for you.
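
If the screenshot tool you pick does not prepend the schemes for you, the step is easy to do beforehand. This is a minimal sketch that turns a host list into http and https URLs; it does not reflect any particular tool’s input format.

```python
def candidate_urls(hosts):
    """Return http:// and https:// URLs for each bare host, keeping full URLs as-is."""
    urls = []
    for host in hosts:
        host = host.strip()
        if not host:
            continue
        if host.startswith(("http://", "https://")):
            urls.append(host)
        else:
            urls.extend([f"http://{host}", f"https://{host}"])
    return urls
```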

Subdomain takeover (can i take over xyz)

“Subdomain takeover vulnerabilities occur when a subdomain (subdomain.example.com) is pointing to a service (e.g. GitHub pages, Heroku, etc.) that has been removed or deleted. This allows an attacker to set up a page on the service that was being used and point their page to that subdomain. For example, if subdomain.example.com was pointing to a GitHub page and the user decided to delete their GitHub page, an attacker can now create a GitHub page, add a CNAME file containing subdomain.example.com, and claim subdomain.example.com.”
A great resource for subdomain takeover is EdOverflow’s repo can-i-take-over-xyz.
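
The check described in the quote comes down to two signals: where the CNAME points, and whether the service answers with its “nothing here” page. The sketch below shows that logic on data you have already collected. The two fingerprints are illustrative examples only and may be out of date; treat can-i-take-over-xyz as the source of truth for which services are actually vulnerable.

```python
# Illustrative fingerprints only (assumed, may be outdated): service suffix -> error text.
FINGERPRINTS = {
    "github.io": "There isn't a GitHub Pages site here.",
    "herokuapp.com": "No such app",
}


def possible_takeover(cname, body):
    """Return the matching service suffix if the CNAME and page body both fit, else None."""
    cname = cname.lower().rstrip(".")
    for suffix, marker in FINGERPRINTS.items():
        points_at_service = cname == suffix or cname.endswith("." + suffix)
        if points_at_service and marker in body:
            return suffix
    return None
```

A match is a candidate, not a finding; the claim still has to be verified against the service, within the program’s rules.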

Subdomain takeover (SubOver & nuclei)

To find subdomain takeovers we can use a few tools. SubOver is a discontinued standalone tool by Ice3man and has since been incorporated into Project Discovery’s nuclei scanner. Nuclei is part of a larger scanning framework but boasts the most takeover checks of any tool I’ve seen.

Automation++

Extending tools (interlace)

Eventually you will want to make a script or recon framework of your own. You will quickly come up against some problems:

Not all tools extend to take different sources of input
Some lack threading.
Not all can be distributed

You can rewrite a tool yourself to handle these issues but some help does exist here. Interlace by Michael Skelton aka Codingo is an awesome tool that can help glue together a recon framework.

Interlace can take these tools and add support for: CIDR input, Glob input, threading, proxying, queued commands, and more. Hakluke wrote a great guide on it here.
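
To give a feel for what “CIDR input” and “threading” add, here is a small sketch of the same two ideas in plain Python. It is not Interlace’s code or interface; the action argument is a stand-in for whatever single-target function or command wrapper you want to run.

```python
import ipaddress
from concurrent.futures import ThreadPoolExecutor


def expand_targets(spec):
    """Expand a CIDR range into host IPs; return hostnames and single IPs unchanged."""
    try:
        network = ipaddress.ip_network(spec, strict=False)
    except ValueError:
        return [spec]  # not an IP or CIDR, so treat it as a hostname
    if network.num_addresses == 1:
        return [str(network.network_address)]
    return [str(ip) for ip in network.hosts()]


def run_all(specs, action, workers=10):
    """Run action(target) across all expanded targets using a thread pool."""
    targets = [target for spec in specs for target in expand_targets(spec)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(targets, pool.map(action, targets)))
```

Wrapping a tool that only accepts one host at a time in an action function is enough to give it range input and concurrency.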

Extending tools (anything TomNomNom writes)

Tomnomnom has an extensive repo of tools which are awesome. I highly suggest you check them all out.

Frameworks


It could be recon is not really your thing. That’s all right.

Several hunters have open sourced their automation at this point and you can choose one that fits you and use it without worrying too much. I usually classify recon frameworks in rough tiers:

C-Tier: automation built around scripting up other tools in bash or python. Step based, no workflow. Few techniques. Little extensibility.

B-Tier: automation writing a few of their own modules. Some GUI or advanced workflow. Medium techniques. Runs point-in-time. Flat files.

A-Tier: automation writing all their own modules. Has GUI. Runs iteratively. Manages data via db.

S-Tier: automation writing their own modules. Has GUI. Runs iteratively. Manages data via db. Scales across multiple boxes. Sends alerts to user. Uses novel techniques and iterates quickly. ML + AI.

Frameworks (S-Tier)
