CyberSherlock

📄

Project Abstract

CyberSherlock is an AI-assisted External Attack Surface Management (EASM) system developed at the University of Bahrain that addresses organizations' struggle to identify unmanaged assets and evaluate their associated risks. It utilizes passive Open-Source Intelligence (OSINT) techniques and HTML parsing to map infrastructure components such as DNS records, SSL certificates, and IP addresses. Built on Python and Flask, the system integrates a specialized AI security advisor aligned with OWASP standards. Testing on environments like OWASP Juice Shop demonstrated the system's efficiency in identifying critical vulnerabilities and outdated protocols within seconds.

The Problem

Organizations Don't Know
What Attackers Can See

As organizations increasingly depend on internet-facing services such as websites, APIs, cloud platforms, and third-party providers, managing and understanding external cybersecurity exposure has become more challenging than ever.

🌐

DNS records publicly expose internal infrastructure layout

High Risk

📋

WHOIS data reveals admin contacts, registrar & org structure

Medium

🔓

Expired or misconfigured SSL certificates create trust issues

Critical

🕸️

Third-party dependencies with outdated or vulnerable software

High Risk

📱

Social media reveals employee roles, email formats & technologies

Medium

🔍

Forgotten subdomains and cloud instances expand attack surface

High Risk

The Blind Spot Problem

Organizations continuously publish data across websites, social media, repositories, and professional networks. This large amount of publicly accessible information creates new cybersecurity concerns as attackers can gather and correlate exposed data from multiple online sources.

Unmanaged External Assets

Developers constantly add new websites, services, and applications while older assets may be overlooked. Digital boundaries have become dynamic due to cloud instances, forgotten subdomains, and third-party APIs — creating an invisible perimeter.

Manual Analysis Doesn't Scale

Employee details, email structures, technologies, DNS records, and other infrastructure-related data may unintentionally reveal valuable insights about an organization's environment to any attacker. Collecting and correlating these scattered sources by hand is slow and hard to repeat as the footprint grows.

No Structured Risk Visibility

Without automated scoring and AI-guided recommendations, security teams cannot effectively prioritize which exposures pose the greatest risk or determine whether publicly available information may introduce cybersecurity risks.

The Solution

Passive. Intelligent.
Actionable.

CyberSherlock operates as a passive OSINT analysis tool that processes uploaded HTML files to collect, analyze, and present publicly available cybersecurity exposure information in a structured and user-friendly manner — without performing intrusive scanning, penetration testing, or exploitation activities.

📄

HTML Upload

Locally saved webpage HTML is the primary passive data source

→

🥣

BeautifulSoup Parse

Extracts links, hosts, scripts, forms & external references

→

🌐

Host Discovery

Unique hostnames, subdomains & main domain identified

→

🛰️

OSINT Collection

DNS, WHOIS, SSL, IP, ASN data from public sources

→

📊

Risk Scoring

Severity-weighted baseline score across all findings

→

🤖

AI Advisor

OWASP-aligned recommendations via AI chatbot

Core Features

Eight Intelligence Modules
for Complete Exposure Mapping

Each module collects, analyzes, and presents a specific dimension of your organization's external digital footprint using only publicly accessible information.

🛰️

Passive OSINT Collection

Aggregates publicly available information from DNS records, WHOIS information, SSL certificate data, IP ownership, ASN details, and reverse DNS — all without sending intrusive packets to the target.

Core Engine

🌐

DNS & WHOIS Analysis

Collects record types A, AAAA, MX, NS, TXT, and CNAME along with WHOIS registration details such as domain name, registrar, creation date, expiration date, organization, and name servers.

DNS Intelligence

🔐

SSL Certificate Inspection

Shows certificate status, subject common name, issuer, validity dates, days remaining before expiration, serial number, and Subject Alternative Names (SAN) to detect misconfigurations.

TLS Analysis

🗺️

IP Ownership Mapping

Displays IP address, ASN, ASN description, network name, and country for all identified IP addresses — providing context about discovered hosting infrastructure and network ownership.

Network Intelligence

📡

External Exposure Analysis

Identifies publicly exposed assets, third-party dependencies, unresolved hosts, and potentially risky external services. Shows hostname, classification, risk level, resolved IP, and reverse DNS.

Host Analysis

📊

Baseline Risk Scoring

Generates baseline risk scores based on collected findings and categorizes them into Low, Medium, High, and Critical severity levels — evaluating exposed hosts, third-party dependencies, and detected findings.

Risk Engine

📱

Social Media Intelligence

Collects publicly available social media information related to organizations, domains, and employees to identify exposed organizational references and information that could support social engineering.

SOCMINT

🤖

AI Security Advisor

Provides AI-assisted analysis and remediation recommendations aligned with OWASP security practices. The AI chatbot uses collected OSINT findings and baseline risk analysis to generate contextual recommendations.

AI-Powered

Risk Model

Baseline Risk Scoring Model

CyberSherlock evaluates findings across multiple categories and assigns severity-weighted scores to produce a consolidated risk level for the organization's external exposure.

Finding Category	Severity	Description
Zone Transfer Enabled	Critical	Allows full DNS namespace enumeration by attackers
Expired SSL Certificate	High	Certificate has passed its validity date — untrusted connection
SSL Certificate Expiring Soon	High	Certificate will expire within 30 days — renewal required
Third-Party Hosts Detected	Medium	External hosts referenced — may expand attack surface
WHOIS Admin Contact Exposed	Medium	Public WHOIS reveals admin email and contact information
Missing Certificate Information	Medium	SSL certificate data incomplete or unavailable
Social Media Exposure	Low	Organizational or employee data discoverable via public profiles
Resolved External IP Detected	Low	IP address publicly associated with target infrastructure
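
The severity-weighted scoring described above can be sketched as follows. This is a minimal illustration, not CyberSherlock's actual algorithm: the numeric weights in SEVERITY_WEIGHTS, the cap at 100, and the band thresholds in LEVEL_BANDS are all assumptions chosen for the example. Only the severity labels and finding names come from the table.

```python
# Assumed weights per severity label; the source does not publish its values.
SEVERITY_WEIGHTS = {"Critical": 40, "High": 25, "Medium": 10, "Low": 3}

# Assumed lower bounds for each consolidated risk level, highest first.
LEVEL_BANDS = [(75, "Critical"), (50, "High"), (25, "Medium"), (0, "Low")]


def baseline_score(findings):
    """Sum the weight of every finding and cap the total at 100."""
    total = sum(SEVERITY_WEIGHTS[f["severity"]] for f in findings)
    return min(total, 100)


def risk_level(score):
    """Map a 0-100 score to Low, Medium, High, or Critical."""
    for floor, label in LEVEL_BANDS:
        if score >= floor:
            return label
    return "Low"


findings = [
    {"title": "Expired SSL Certificate", "severity": "High"},
    {"title": "Third-Party Hosts Detected", "severity": "Medium"},
    {"title": "Resolved External IP Detected", "severity": "Low"},
]
score = baseline_score(findings)
level = risk_level(score)
```

Capping the sum keeps the score inside the 0 to 100 range even when many findings accumulate. A weighted average would be another reasonable design, but it would hide the effect of finding count.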

Live Dashboard

Real Dashboard Screenshots

The CyberSherlock dashboard presents collected OSINT data through a centralized single-page interface with organized sections, contextual explanations, and structured visual presentation.

Dashboard Overview — Summary Cards

Displays organization name, main domain, discovered subdomains, total hosts, unique IPs, and baseline risk score at a glance.

🔒 localhost:5000 — CyberSherlock Dashboard

Security Findings & Exposure Analysis

Each finding is displayed with severity level (Low / Medium / High / Critical), category, title, and a simplified explanation for users with limited technical knowledge.

🔒 localhost:5000 — Security Findings Interface

Domain Registration & Subdomain Information

Shows domain name, registrar, creation date, expiration date, organization, and name servers — alongside discovered subdomains from the saved HTML file.

🔒 localhost:5000 — Domain & Subdomain Information


Host Analysis Interface

Shows each discovered host with its classification, risk level, risk score, resolved IP address, reverse DNS value, and the reason behind the assigned risk.

🔒 localhost:5000 — Host Analysis

SSL Certificate Information

Displays certificate status, subject common name, issuer, validity dates, days remaining before expiration, serial number, and Subject Alternative Names (SAN).

🔒 localhost:5000 — SSL Certificates

DNS Records Interface

Displays record types A, AAAA, MX, NS, TXT, and CNAME with their values and a short explanation of each record type to help users understand the domain's public configuration.

🔒 localhost:5000 — DNS Records

IP Ownership & Network Information

Displays IP address, ASN, ASN description, network name, and country for each identified IP — providing additional context about discovered infrastructure ownership.

🔒 localhost:5000 — IP Ownership & Network

Social Media Intelligence Interface

A unified monitor for real-time tracking of activity status (Active/Inactive) across social media accounts such as Instagram, Facebook, and X.

🔒 localhost:5000 — Social Media Intelligence Interface

CyberSherlock AI Security Advisor

An interactive AI chatbot that answers security questions and provides OWASP-aligned remediation recommendations based on collected OSINT findings and baseline risk analysis.

🔒 localhost:5000 — AI Security Advisor

How It Works

From HTML File to
Complete Intelligence Report

A structured seven-stage pipeline that transforms a saved webpage HTML file into a complete external attack surface analysis, using passive techniques only.

Upload HTML Source

A locally saved copy of the organization's main webpage HTML file is provided as the primary data source. The Flask application verifies the file and reads its content into memory for processing.
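
The verification step might look like the sketch below. It is written as a plain function so it can be called from a Flask upload view. The function name read_html_upload, the allowed extensions, and the 5 MiB size cap are assumptions made for illustration; the source only states that the file is verified and read into memory.

```python
from pathlib import PurePath

ALLOWED_EXTENSIONS = {".html", ".htm"}  # assumed whitelist
MAX_BYTES = 5 * 1024 * 1024             # assumed 5 MiB upload cap


def read_html_upload(filename: str, data: bytes) -> str:
    """Validate an uploaded file and return its content as text."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or 'none'}")
    if not data:
        raise ValueError("file is empty")
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds the size limit")
    # Decode leniently: saved pages are not always valid UTF-8.
    return data.decode("utf-8", errors="replace")
```

In a Flask view, the filename and bytes would typically come from the uploaded file object in request.files, with the ValueError mapped to a 400 response.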

→

Extract Hosts & Domains

BeautifulSoup parses the HTML and scans tags for attributes like href, src, action, data-src, and srcset. Hostnames are extracted, normalized, and deduplicated. The most-repeated registered domain is selected as the main domain.
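
A minimal sketch of this extraction step follows. To stay dependency-free it swaps in the standard library's html.parser for BeautifulSoup, so it is not the project's code. The names HostCollector, registered_domain, and extract_hosts are hypothetical, and treating the last two labels as the registered domain is a simplifying assumption (a real implementation would consult the Public Suffix List to handle suffixes such as co.uk).

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit

URL_ATTRS = {"href", "src", "action", "data-src", "srcset"}


class HostCollector(HTMLParser):
    """Collects hostnames from URL-bearing attributes of every tag."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in URL_ATTRS or not value:
                continue
            if name == "srcset":
                # srcset holds comma-separated "url descriptor" pairs.
                urls = [p.split()[0] for p in value.split(",") if p.strip()]
            else:
                urls = [value.strip()]
            for url in urls:
                try:
                    host = urlsplit(url).hostname
                except ValueError:  # malformed URL, skip it
                    continue
                if host:  # relative links have no hostname
                    self.hosts.append(host.lower().rstrip("."))


def registered_domain(host):
    # Naive assumption: the registered domain is the last two labels.
    return ".".join(host.split(".")[-2:])


def extract_hosts(html):
    """Return (sorted unique hosts, most-repeated registered domain)."""
    collector = HostCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(registered_domain(h) for h in collector.hosts)
    main_domain = counts.most_common(1)[0][0] if counts else None
    return sorted(set(collector.hosts)), main_domain
```

Counting every occurrence before deduplicating is what lets the most-referenced registered domain win the main-domain selection.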

→

Collect Passive OSINT

The Passive OSINT Data Collection Module queries public APIs and databases for DNS records, WHOIS information, SSL certificate details, reverse DNS, ASN details, and IP ownership information — all without direct interaction with the target.
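
As one illustration of processing collected data, the sketch below summarizes a certificate record and computes the days remaining before expiration. It assumes the certificate has already been obtained and is shaped like the dictionary returned by Python's ssl.SSLSocket.getpeercert(); how CyberSherlock actually sources and stores certificate data is not stated in this page. The function name summarize_certificate and the status labels are hypothetical, while the 30-day warning window follows the risk table.

```python
import ssl
from datetime import datetime, timezone

EXPIRY_WARNING_DAYS = 30  # "expiring soon" window from the risk table


def summarize_certificate(cert: dict, now: datetime) -> dict:
    """Summarize a getpeercert()-style dict; `now` must be timezone-aware."""
    # subject and issuer are tuples of RDNs, each a tuple of (key, value).
    subject = {k: v for rdn in cert.get("subject", ()) for k, v in rdn}
    issuer = {k: v for rdn in cert.get("issuer", ()) for k, v in rdn}
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    days_left = (expires - now).days
    if days_left < 0:
        status = "expired"
    elif days_left <= EXPIRY_WARNING_DAYS:
        status = "expiring soon"
    else:
        status = "valid"
    return {
        "common_name": subject.get("commonName"),
        "issuer": issuer.get("commonName"),
        "serial_number": cert.get("serialNumber"),
        "days_remaining": days_left,
        "status": status,
        "san": [v for kind, v in cert.get("subjectAltName", ()) if kind == "DNS"],
    }
```

Passing the current time in as a parameter keeps the function deterministic, which makes the expiry logic easy to check against fixed dates.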

Analyze External Exposure

Collected OSINT data is analyzed to identify publicly exposed assets, third-party dependencies, unresolved hosts, and potentially risky external services associated with the target organization.
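
The host classification in this stage could be sketched as below. The labels "first-party", "third-party", and "unresolved" and the function name classify_host are assumptions for illustration; the source names the categories of interest but not the exact classification values or rules.

```python
from typing import Optional


def classify_host(host: str, main_domain: str, resolved_ip: Optional[str]) -> dict:
    """Classify a discovered host relative to the main domain."""
    host = host.lower().rstrip(".")
    # A host belongs to the organization if it is the main domain
    # itself or one of its subdomains.
    first_party = host == main_domain or host.endswith("." + main_domain)
    if resolved_ip is None:
        classification = "unresolved"
    elif first_party:
        classification = "first-party"
    else:
        classification = "third-party"
    return {
        "host": host,
        "classification": classification,
        "resolved_ip": resolved_ip,
    }
```

Matching on "." plus the main domain, rather than a bare suffix, prevents a look-alike such as notexample.com from being treated as a subdomain of example.com.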

→

Generate Risk Score

A severity-weighted scoring algorithm evaluates each finding — exposed hosts, third-party dependencies, missing information, and detected vulnerabilities — to produce a consolidated Baseline Risk Score from 0–100 with category breakdowns.

→

Social Media

This feature automatically identifies the target's presence across multiple platforms, including Instagram, X (Twitter), LinkedIn, and others. By scanning the HTML source, CyberSherlock extracts these social handles and links them together. This process helps in visualizing the full digital footprint of the target, making it easier to see how different accounts are interconnected and related to the main domain.
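
Extracting a handle from a link found in the HTML source might work roughly like this. The PLATFORMS table, the function name social_handle, and the path rules (first path segment as the handle, with LinkedIn's /company/ and /in/ prefixes skipped) are assumptions for the sketch and do not cover every URL shape these platforms use.

```python
from urllib.parse import urlsplit

# Assumed mapping from hostname to platform label.
PLATFORMS = {
    "instagram.com": "Instagram",
    "x.com": "X",
    "twitter.com": "X",
    "facebook.com": "Facebook",
    "linkedin.com": "LinkedIn",
}


def social_handle(url: str):
    """Return (platform, handle) for a social profile URL, else None."""
    try:
        parts = urlsplit(url)
    except ValueError:  # malformed URL
        return None
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    platform = PLATFORMS.get(host)
    if platform is None:
        return None
    segments = [s for s in parts.path.split("/") if s]
    # LinkedIn profiles live under /company/<name> or /in/<name>.
    if platform == "LinkedIn" and segments and segments[0] in {"company", "in"}:
        segments = segments[1:]
    if not segments:
        return None
    return platform, segments[0].lstrip("@")
```

Mapping both twitter.com and x.com to the same label lets older links on a page resolve to the same account entry.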

AI Recommendations

The CyberSherlock AI Security Advisor processes all findings and generates prioritized, context-aware remediation recommendations aligned with OWASP security practices to help users understand and mitigate identified risks.
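
One way the findings could be handed to a chatbot is as a plain-text context block, sketched below. The prompt wording, the function name build_advisor_prompt, and the severity ordering are hypothetical. The source does not describe which model is used or how its prompts are built, so no model API is called here.

```python
# Lower rank sorts first, so the most severe findings lead the prompt.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def build_advisor_prompt(score: int, findings: list, question: str) -> str:
    """Assemble scan context and a user question into one prompt string."""
    ordered = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
    lines = [f"- [{f['severity']}] {f['title']}" for f in ordered]
    return "\n".join(
        [
            "You are a security advisor. Align recommendations with OWASP guidance.",
            f"Baseline risk score: {score}/100",
            "Findings:",
            *lines,
            f"Question: {question}",
        ]
    )
```

Listing findings in severity order gives the model the same prioritization the dashboard shows, which helps keep its recommendations consistent with the risk score.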

AI Advisor — Real Interaction Log

CyberSherlock AI Security Advisor

The AI chatbot uses collected OSINT findings and baseline risk analysis to generate contextual, OWASP-aligned security recommendations. Below are actual interaction examples from system testing.

🤖

CyberSherlock Security Advisor

OWASP-aligned AI · Powered by collected OSINT findings

👤

Can you justify the current Risk Score? Tell me which factor contributed most to this level.

Based on the EASM scan results provided, the overall Risk Score is 22/100 (Low). The main contributor is the "Third-party hosts detected" finding with a severity of Medium. This indicates one external host was found in the saved HTML, which may expand external exposure. The finding suggests dependencies on external resources that are outside direct organizational control.

👤

Provide a top-3 priority action plan for the IT team to reduce our external attack surface.

[SECURITY ALERT] Based on the results, I recommend:

1. Vulnerability Remediation: Review identified third-party hosts (e.g., external domains found in HTML source) and assess whether they are necessary dependencies.

2. Network Segmentation: Isolate and restrict access to sensitive areas to minimize the impact of external dependency exposure.

3. Regular Security Audits: Schedule periodic passive OSINT monitoring to detect new threats and changes to your external attack surface in a timely manner.

👤

Given that "Third-party hosts" are the main risk factor, provide a technical recommendation to harden web assets and mitigate potential supply chain attacks.

Technical Recommendation:

1. Inventory and Monitoring: Conduct a comprehensive inventory of all third-party libraries, CDNs, and external scripts referenced in your web assets.

2. Vulnerability Scanning: Perform regular passive scans against known CVE databases for all identified third-party dependencies to detect outdated or vulnerable components.

3. Subresource Integrity (SRI): Implement SRI hashes on all externally loaded scripts and stylesheets to prevent tampering — as recommended by OWASP A06: Vulnerable and Outdated Components.

Ethical Security Research

Built on Responsible Intelligence

CyberSherlock follows ethical cybersecurity practices by relying only on passive Open-Source Intelligence (OSINT) collection methods. The platform avoids intrusive scanning, penetration testing, exploitation activities, or direct interaction with target systems.

✗ Active Scanning · ✗ Exploitation · ✗ Penetration Testing · ✓ Passive OSINT Only

🛡️

Zero Intrusion

CyberSherlock never sends probes or intrusive requests to target systems. All data is sourced from public records without generating detectable network traffic.

⚖️

Legal & Compliant

Operates exclusively on publicly accessible information: no unauthorized access, no grey-area techniques, no credential harvesting.

📖

OWASP Aligned

All AI recommendations and risk assessments follow OWASP Top 10 and established security frameworks, validated before being presented to users.

Passive OSINT Only

OWASP Top 10 Aligned

No Active Scanning

No Penetration Testing

No Credential Harvesting

Responsible Disclosure

Tested on OWASP Juice Shop

Understand your public exposure
before attackers do.