Using Photon Scanner to Scrape Web OSINT Data « Null Byte :: WonderHowTo



Gathering information about an online target can be time-consuming, especially if you only need specific pieces of information about a target with many subdomains. We can use a web crawler called Photon, designed for OSINT, to do the heavy lifting and sift through URLs on our behalf to retrieve information of value to a hacker.

All of this is used to learn as much as possible about the target without tipping it off that it is being watched. This rules out some of the more obvious methods of scanning and enumeration, and requires some creativity in finding clues.

Know what to look for

The Photon OSINT scanner fills this niche with a flexible, easy-to-use command line interface for crawling through target webpages. Rather than just looking for vulnerabilities, Photon quickly parses what's out there and displays it to the hacker in a way that's easy to understand.

One of the most useful features of Photon is its ability to automatically recognize and extract certain kinds of data, such as page scripts, email addresses, and important passwords or API keys that may have been exposed by accident.

Aside from current webpages, Photon also lets you look into the past. You can use preserved earlier states of webpages documented on the Wayback Machine as a "seed" for your search, using all the URLs from the now-defunct website as a source for further crawling. While using Photon effectively takes a little patience and an understanding of the many filters available, it doesn't take much to start pulling in clues about your target.
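To make the idea of automatic extraction concrete, here is a minimal Python sketch of pulling email-like strings out of a page with a regular expression. It is not Photon's own code; the pattern, the function name extract_emails, and the sample HTML are all assumptions made for illustration.

```python
import re

# Deliberately simple pattern (an assumption, not Photon's regex);
# real-world address syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html):
    """Return the unique email-like strings in html, in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(html):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical page snippet used only for illustration.
sample = (
    '<a href="mailto:press@example.com">Press</a> '
    "<p>Contact press@example.com or jobs@example.org</p>"
)
print(extract_emails(sample))  # ['press@example.com', 'jobs@example.org']
```

A pattern this loose will also match things that are not addresses at all, such as image filenames containing an @ sign, so results from any scraper of this kind need a second look.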
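As a rough illustration of how archived URLs can become crawl seeds, the sketch below builds a query for the Internet Archive's public CDX search endpoint and turns a JSON response into a list of seed URLs. This is an assumption about one workable approach, not a description of what Photon's --wayback option does internally; the function names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine CDX search endpoint.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=50):
    """Build a CDX query URL listing archived URLs under the given domain."""
    params = {
        "url": domain + "/*",    # everything below the domain
        "output": "json",        # JSON array of rows; first row is the header
        "fl": "original",        # only return the originally captured URL
        "collapse": "urlkey",    # one row per distinct URL
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def seeds_from_cdx_json(body):
    """Turn a CDX JSON response body into a sorted list of unique seed URLs."""
    rows = json.loads(body)
    return sorted({row[0] for row in rows[1:]})  # skip the header row


# Hypothetical response body, shaped like the API's JSON output.
sample_body = '[["original"], ["http://example.com/old-login"], ["http://example.com/about"]]'
print(build_cdx_query("example.com"))
print(seeds_from_cdx_json(sample_body))
```

The resulting list could be handed to a crawler as additional starting points, which is the role the -s/--seeds option plays in Photon's help output.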

What you need

Photon is a popular tool. It's cross-platform, so it works on any system where Python is installed. I found that it crashes when run under Python 2, so I recommend running it with the python3 command, despite what the GitHub instructions say.

To check whether Python is installed on your system, open a terminal window and type python3. If you don't have it installed, you can install it with apt install python3. If your output looks like the following, you're ready to go.

  python3
  Python 3.6.8 (default, Jan  3 2019, 03:42:36)
  [GCC 8.2.0] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>>

Type quit() to exit the Python shell, and we'll start installing everything we need to run Photon.

Step 1: Download and Install Photon

To get started with Photon, make sure you have Python 3 installed. Once you do, we also need to install some dependencies. In a terminal window, run the following command to download and install the required libraries:

  pip install tld requests 

When this process completes, you can download Photon and navigate to its directory with the following commands. Don't skip the cd line.

  git clone https://github.com/s0md3v/Photon.git
  cd Photon

Step 2: View Photon's Options

Now we can run python3 photon.py -h to see the list of options we can use to scan.

  python3 photon.py -h

        ____  __          __
       / __ \/ /_  ____  / /_____  ____
      / /_/ / __ \/ __ \/ __/ __ \/ __ \
     / ____/ / / / /_/ / /_/ /_/ / / / /
    /_/   /_/ /_/\____/\__/\____/_/ /_/ v1.2.1

  usage: photon.py [-h] [-u ROOT] [-c COOK] [-r REGEX] [-e EXPORT] [-o OUTPUT]
                   [-l LEVEL] [-t THREADS] [-d DELAY] [-v]
                   [-s SEEDS [SEEDS ...]] [--stdout STD]
                   [--user-agent USER_AGENT] [--exclude EXCLUDE]
                   [--timeout TIMEOUT] [--clone] [--headers] [--dns] [--ninja]
                   [--keys] [--update] [--only-urls] [--wayback]

  optional arguments:
    -h, --help            show this help message and exit
    -u ROOT, --url ROOT   root url
    -c COOK, --cookie COOK
                          cookie
    -r REGEX, --regex REGEX
                          regex pattern
    -e EXPORT, --export EXPORT
                          export format
    -o OUTPUT, --output OUTPUT
                          output directory
    -l LEVEL, --level LEVEL
                          levels to crawl
    -t THREADS, --threads THREADS
                          number of threads
    -d DELAY, --delay DELAY
                          delay between requests
    -v, --verbose         verbose output
    -s SEEDS [SEEDS ...], --seeds SEEDS [SEEDS ...]
                          additional seed URLs
    --stdout STD          send variables to stdout
    --user-agent USER_AGENT
                          custom user agent(s)
    --exclude EXCLUDE     exclude URLs matching this regex
    --timeout TIMEOUT     http request timeout
    --clone               clone the website locally
    --headers             add headers
    --dns                 enumerate subdomains and DNS data
    --ninja               ninja mode
    --keys                find secret keys
    --update              update photon
    --only-urls           only extract URLs
    --wayback             fetch URLs from archive.org as seeds

To run the most basic scan, the formula is python3 photon.py -u target.com.
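If you plan to run many scans, it can help to assemble the command line programmatically. The following is a small sketch of a hypothetical helper, build_photon_command, that only uses flags shown in the help listing above (-u, -l, -t, --keys, --dns, --ninja, --wayback). The helper itself is an assumption for illustration and is not part of Photon.

```python
import shlex


def build_photon_command(url, level=None, threads=None,
                         keys=False, dns=False, ninja=False, wayback=False):
    """Return an argv list for a Photon scan, using flags from its help text."""
    cmd = ["python3", "photon.py", "-u", url]
    if level is not None:
        cmd += ["-l", str(level)]      # levels to crawl
    if threads is not None:
        cmd += ["-t", str(threads)]    # number of threads
    for enabled, flag in ((keys, "--keys"), (dns, "--dns"),
                          (ninja, "--ninja"), (wayback, "--wayback")):
        if enabled:
            cmd.append(flag)
    return cmd


# The most basic scan, printed as a shell-ready string.
print(shlex.join(build_photon_command("target.com")))
# python3 photon.py -u target.com
```

The returned list can be passed to subprocess.run(cmd, cwd="Photon"), assuming the repository was cloned into a directory named Photon as in Step 1.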

Step 3: Map DNS Information

One of the most useful and interesting features of Photon is its ability to generate a visual DNS map of everything connected to the domain. This gives you a broad view of the kinds of software running on the machines behind the target domain.

To do this, we run a scan with the --dns flag. To generate a map of priceline.com, you can run the command python3 photon.py -u priceline.com --dns in a terminal window.

  python3 photon.py -u https://www.priceline.com/ --dns

  URLs retrieved from robots.txt: 111
  Level 1: 112 URLs
  Progress: 112/112
  Level 2: 112 URLs
  Progress: 112/112
  Crawling 0 JavaScript files

  --------------------------------------------------
  Robots: 111
  Internal: 112
  --------------------------------------------------
  Total requests made: 0
  Total time taken: 0 minutes 26 seconds
  Requests per second: 0
  Enumerating subdomains
  79 subdomains found
  Generating DNS map
  Results saved in www.priceline.com directory

The resulting subdomain map is huge! It's far too big to fit here, so let's look at a few segments. We can see servers and IP addresses associated with Priceline's services. Here is an excerpt:

Below are third-party integrations and other infrastructure associated with Priceline's services. This also gives us information about the mail servers in use and any poorly secured third-party services we could exploit to gain access. Again, this is a zoomed-out view:

Let's look at the MX records responsible for email. Evidently, Google services and VeriSign are in use.

Zooming in further, we can see that Varnish, BIG-IP, and Nginx servers are detected. Connected to a DigitalOcean account, we see an Ubuntu server running a specific version of OpenSSH. Hopefully that version isn't vulnerable.

A closer look at Priceline's core services reveals Microsoft, Apache, and BIG-IP systems. In some cases, we can see the specific versions of the services hosted on these IP addresses.

All of this is a gold mine for hackers looking for the most vulnerable system connected to the target.
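The same kind of relationship the DNS map draws, which hostnames sit behind which IP address, can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function group_hosts_by_ip, the hostnames, and the addresses (taken from the reserved documentation range) are hypothetical, and a stand-in resolver is used so the example runs without touching the network.

```python
import socket
from collections import defaultdict


def group_hosts_by_ip(hostnames, resolve=socket.gethostbyname):
    """Group hostnames by the IP address they resolve to."""
    groups = defaultdict(list)
    for host in hostnames:
        try:
            ip = resolve(host)
        except OSError:          # socket.gaierror is a subclass of OSError
            ip = "unresolved"
        groups[ip].append(host)
    return dict(groups)


# Hypothetical lookup table standing in for real DNS answers.
FAKE_DNS = {
    "www.example.com": "192.0.2.10",
    "shop.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.20",
}


def fake_resolve(host):
    """Offline resolver used for the demonstration only."""
    try:
        return FAKE_DNS[host]
    except KeyError:
        raise OSError("no such host: " + host)


hosts = ["www.example.com", "shop.example.com", "mail.example.com", "old.example.com"]
for ip, names in group_hosts_by_ip(hosts, resolve=fake_resolve).items():
    print(ip, names)
```

Dropping the resolve argument makes the function use real DNS lookups, which does send queries on behalf of your machine, so only do that for hosts you are authorized to investigate.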

Step 4: Extracting Secret Keys and Intel

Next, we'll try to grab some email addresses and keys from a website, using PBS.org as our example.

To run the search, we add a few more flags to increase its depth and speed. In a terminal window, we can run python3 photon.py -u pbs.org --keys -t 10 -l 3 to specify that we want to go three levels of URLs deep and open ten threads to do the crawling. The results come back in a file called intel, the beginning of which looks like this:

  python3 photon.py -u https://www.pbs.org/ --keys -t 10 -l 3

  delaney@delaneyantiqueclocks.com
  shcurry@pbs.org
  andrew@brunkauctions.com
  nansollo@gmail.com
  frontlinemedia@pbs.org
  info@weissauctions.com
  ledyer256@aol.com
  stock_sales@wgbh.org
  ian.ehling@bonhams.com
  nanchisholm@gmail.com
  CollegeBehindBarsDKC@dkcnews.com
  AppIcon57x57@2x.png
  agm6@advanceguardmilitaria.com
  travis@bruneauandco.com
  frontline@pbs.org

We've captured some email addresses! We cast a fairly wide net for this search, so there may be many unrelated addresses in our list. That's because we crawled three levels of URLs deep and likely scraped some unrelated websites along the way.
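Because the list mixes addresses from unrelated sites with at least one false positive (AppIcon57x57@2x.png is an image filename, not an address), a little post-processing helps. Below is a minimal sketch that keeps only entries belonging to the target domain; the function name filter_intel and the list of image suffixes are assumptions, and the sample entries are taken from the output above.

```python
# File extensions that indicate a filename rather than an address (an assumption).
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


def filter_intel(entries, target_domain):
    """Keep email-like entries whose domain is the target or one of its subdomains."""
    keep = []
    for entry in entries:
        candidate = entry.strip().lower()
        if not candidate or candidate.endswith(IMAGE_SUFFIXES):
            continue  # skip blanks and filenames such as icon@2x.png
        domain = candidate.rsplit("@", 1)[-1]
        if domain == target_domain or domain.endswith("." + target_domain):
            keep.append(candidate)
    return keep


intel = [
    "frontlinemedia@pbs.org",
    "info@weissauctions.com",
    "AppIcon57x57@2x.png",
    "frontline@pbs.org",
]
print(filter_intel(intel, "pbs.org"))  # ['frontlinemedia@pbs.org', 'frontline@pbs.org']
```

To process a real run, read the lines of the intel file from Photon's output directory and pass them in as entries.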

Although no keys were found in this scan, the flag we set causes Photon to search for strings that may be API keys or other sensitive credentials that were unintentionally made public on the target's website.

Step 5: Make Requests Through a Third Party Using Ninja Mode
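One common heuristic for spotting strings that might be keys is to look for long tokens with high Shannon entropy, since randomly generated secrets use many different characters fairly evenly. The sketch below illustrates that idea only; the function names, the thresholds, and the sample tokens are assumptions, and this is not presented as Photon's actual detection logic.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_key(token, min_length=16, min_entropy=3.5):
    """Flag long, high-entropy tokens; thresholds are illustrative assumptions."""
    return len(token) >= min_length and shannon_entropy(token) >= min_entropy


# Made-up tokens for illustration; neither is a real credential.
print(looks_like_key("q7Zp2LxV9cR4tB8mWk3J"))   # True
print(looks_like_key("aaaaaaaaaaaaaaaaaaaa"))   # False
```

A heuristic like this produces false positives (hashes, minified identifiers) and misses short or patterned secrets, so flagged strings are leads to review rather than confirmed keys.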

Let's say we're working from a sensitive IP address, such as a police station, a government office, or even just a home that you don't want the target to know is visiting its website. You can put distance between yourself and the target by using the --ninja flag, which sends your requests to a third-party website that makes the request on your behalf and passes the response back to you.

The result is slower, but it eliminates the risk of the target recognizing the IP address of the organization you work for. Because you have less control over these requests, keep in mind that they may take much longer to complete.

To run a lighter version of the previous scan in "ninja" mode, you can run the command python3 photon.py -u pbs.com --keys -t 10 -l 1 --ninja in a terminal window.

  python3 photon.py -u https://www.pbs.com/ --keys -t 10 -l 1 --ninja

Photon Makes Crawling Through URLs Lightning Fast

Crawling through hundreds of URLs for information is rarely something you want to do by hand. Photon makes it easy to crawl large numbers of subdomains or multiple targets, allowing you to scale your research during the recon phase. With built-in options to parse and search for types of data such as email addresses and important API keys, Photon can catch even the small mistakes a target makes that reveal a lot of valuable information.

I hope you enjoyed this guide to using the Photon OSINT scanner to crawl websites for OSINT data! If you have any questions about this web scraping tutorial, leave a comment, and feel free to reach me on Twitter @KodyKinzie.

Don't Miss: Discover Hidden Subdomains to Reveal Internal Services with CT-Exposer

Cover image and screenshots by Kody/Null Byte



