University of Bahrain · College of Information Technology · B.Sc. Cybersecurity
AI-Powered Cybersecurity Intelligence · ITCY 499 Senior Project

CyberSherlock

AI-Assisted External Attack Surface Management (EASM) platform that utilizes passive Open-Source Intelligence (OSINT) to map and analyze your organization's publicly exposed digital assets — without intrusion.

8+
OSINT Modules
90%
Passive Only
OWASP
Aligned AI
Flask
Python Backend
📄

Project Abstract

CyberSherlock is an AI-assisted External Attack Surface Management (EASM) system developed at the University of Bahrain that addresses organizations' struggle to identify unmanaged assets and evaluate their associated risks. It utilizes passive Open-Source Intelligence (OSINT) techniques and HTML parsing to map infrastructure components such as DNS records, SSL certificates, and IP addresses. Built on Python and Flask, the system integrates a specialized AI security advisor aligned with OWASP standards. Testing on environments like OWASP Juice Shop demonstrated the system's efficiency in identifying critical vulnerabilities and outdated protocols within seconds.

The Problem

Organizations Don't Know
What Attackers Can See

As organizations increasingly depend on internet-facing services, websites, APIs, cloud platforms, and third-party providers, managing and understanding external cybersecurity exposure has become more challenging than ever.

🌐
DNS records publicly expose internal infrastructure layout
High Risk
📋
WHOIS data reveals admin contacts, registrar & org structure
Medium
🔓
Expired or misconfigured SSL certificates create trust issues
Critical
🕸️
Third-party dependencies with outdated or vulnerable software
High Risk
📱
Social media reveals employee roles, email formats & technologies
Medium
🔍
Forgotten subdomains and cloud instances expand attack surface
High Risk
01

The Blind Spot Problem

Organizations continuously publish data across websites, social media, repositories, and professional networks. This large amount of publicly accessible information creates new cybersecurity concerns as attackers can gather and correlate exposed data from multiple online sources.

02

Unmanaged External Assets

Developers constantly add new websites, services, and applications while older assets may be overlooked. Digital boundaries have become dynamic due to cloud instances, forgotten subdomains, and third-party APIs — creating an invisible perimeter.

03

Manual Analysis Doesn't Scale

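Information such as employee details, email structures, technologies, DNS records, and infrastructure-related data may unintentionally reveal valuable insights about an organization's environment to any attacker. Collecting and correlating this information by hand across many sources does not scale as the external footprint grows.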

04

No Structured Risk Visibility

Without automated scoring and AI-guided recommendations, security teams cannot effectively prioritize which exposures pose the greatest risk or determine whether publicly available information may introduce cybersecurity risks.

The Solution

Passive. Intelligent.
Actionable.

CyberSherlock operates as a passive OSINT analysis tool that processes uploaded HTML files to collect, analyze, and present publicly available cybersecurity exposure information in a structured and user-friendly manner — without performing intrusive scanning, penetration testing, or exploitation activities.

📄

HTML Upload

Locally saved webpage HTML is the primary passive data source

🥣

BeautifulSoup Parse

Extracts links, hosts, scripts, forms & external references

🌐

Host Discovery

Unique hostnames, subdomains & main domain identified

🛰️

OSINT Collection

DNS, WHOIS, SSL, IP, ASN data from public sources

📊

Risk Scoring

Severity-weighted baseline score across all findings

🤖

AI Advisor

OWASP-aligned recommendations via AI chatbot

Core Features

Eight Intelligence Modules
for Complete Exposure Mapping

Each module collects, analyzes, and presents a specific dimension of your organization's external digital footprint using only publicly accessible information.

🛰️

Passive OSINT Collection

Aggregates publicly available information from DNS records, WHOIS information, SSL certificate data, IP ownership, ASN details, and reverse DNS — all without sending intrusive packets to the target.

Core Engine
🌐

DNS & WHOIS Analysis

Collects record types A, AAAA, MX, NS, TXT, and CNAME along with WHOIS registration details such as domain name, registrar, creation date, expiration date, organization, and name servers.

DNS Intelligence
🔐

SSL Certificate Inspection

Shows certificate status, subject common name, issuer, validity dates, days remaining before expiration, serial number, and Subject Alternative Names (SAN) to detect misconfigurations.

TLS Analysis
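As an illustration of the "days remaining before expiration" field, the sketch below derives it from a certificate's notAfter string using only the Python standard library. The function names and status labels are hypothetical; the 30-day threshold is the one the risk model uses for its "expiring soon" finding, and the input format is assumed to be the one returned by `ssl.SSLSocket.getpeercert()`.

```python
import ssl
from datetime import datetime, timezone

EXPIRING_SOON_DAYS = 30  # matches the "within 30 days" finding in the risk model


def days_remaining(not_after: str, now: datetime) -> int:
    """Whole days until expiry; negative once the certificate has expired."""
    # cert_time_to_seconds parses strings such as "Jan  5 09:34:43 2018 GMT".
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def certificate_status(not_after: str, now: datetime) -> str:
    """Map the remaining lifetime to a hypothetical status label."""
    days = days_remaining(not_after, now)
    if days < 0:
        return "expired"
    if days <= EXPIRING_SOON_DAYS:
        return "expiring soon"
    return "valid"
```

Passing the current time in explicitly (for example `datetime.now(timezone.utc)`) keeps the calculation easy to test against fixed dates.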
🗺️

IP Ownership Mapping

Displays IP address, ASN, ASN description, network name, and country for all identified IP addresses — providing context about discovered hosting infrastructure and network ownership.

Network Intelligence
📡

External Exposure Analysis

Identifies publicly exposed assets, third-party dependencies, unresolved hosts, and potentially risky external services. Shows hostname, classification, risk level, resolved IP, and reverse DNS.

Host Analysis
📊

Baseline Risk Scoring

Generates baseline risk scores based on collected findings and categorizes them into Low, Medium, High, and Critical severity levels — evaluating exposed hosts, third-party dependencies, and detected findings.

Risk Engine
📱

Social Media Intelligence

Collects publicly available social media information related to organizations, domains, and employees to identify exposed organizational references and information that could support social engineering.

SOCMINT
🤖

AI Security Advisor

Provides AI-assisted analysis and remediation recommendations aligned with OWASP security practices. The AI chatbot uses collected OSINT findings and baseline risk analysis to generate contextual recommendations.

AI-Powered
Risk Model

Baseline Risk Scoring Model

CyberSherlock evaluates findings across multiple categories and assigns severity-weighted scores to produce a consolidated risk level for the organization's external exposure.

Finding Category | Severity | Description
Zone Transfer Enabled | Critical | Allows full DNS namespace enumeration by attackers
Expired SSL Certificate | High | Certificate has passed its validity date (untrusted connection)
SSL Certificate Expiring Soon | High | Certificate will expire within 30 days (renewal required)
Third-Party Hosts Detected | Medium | External hosts referenced; may expand attack surface
WHOIS Admin Contact Exposed | Medium | Public WHOIS reveals admin email and contact information
Missing Certificate Information | Medium | SSL certificate data incomplete or unavailable
Social Media Exposure | Low | Organizational or employee data discoverable via public profiles
Resolved External IP Detected | Low | IP address publicly associated with target infrastructure
Live Dashboard

Real Dashboard Screenshots

The CyberSherlock dashboard presents collected OSINT data through a centralized single-page interface with organized sections, contextual explanations, and structured visual presentation.

1

Dashboard Overview — Summary Cards

Displays organization name, main domain, discovered subdomains, total hosts, unique IPs, and baseline risk score at a glance.

[Screenshot: CyberSherlock Summary Cards Dashboard, localhost:5000]
2

Security Findings & Exposure Analysis

Each finding is displayed with severity level (Low / Medium / High / Critical), category, title, and a simplified explanation for users with limited technical knowledge.

[Screenshot: Security Findings Interface, localhost:5000]
3

Domain Registration & Subdomain Information

Shows domain name, registrar, creation date, expiration date, organization, and name servers — alongside discovered subdomains from the saved HTML file.

[Screenshot: Domain and Subdomain Information Interface, localhost:5000]
4

Host Analysis Interface

Shows each discovered host with its classification, risk level, risk score, resolved IP address, reverse DNS value, and the reason behind the assigned risk.

[Screenshot: Host Analysis Interface, localhost:5000]
5

SSL Certificate Information

Displays certificate status, subject common name, issuer, validity dates, days remaining before expiration, serial number, and Subject Alternative Names (SAN).

[Screenshot: SSL Certificate Interface, localhost:5000]
6

DNS Records Interface

Displays record types A, AAAA, MX, NS, TXT, and CNAME with their values and a short explanation of each record type to help users understand the domain's public configuration.

[Screenshot: DNS Records Interface, localhost:5000]
7

IP Ownership & Network Information

Displays IP address, ASN, ASN description, network name, and country for each identified IP — providing additional context about discovered infrastructure ownership.

[Screenshot: IP Ownership Interface, localhost:5000]
8

CyberSherlock AI Security Advisor

An interactive AI chatbot that answers security questions and provides OWASP-aligned remediation recommendations based on collected OSINT findings and baseline risk analysis.

[Screenshot: AI Security Advisor Chatbot, localhost:5000]
How It Works

From HTML File to
Complete Intelligence Report

A structured seven-stage pipeline that transforms a saved webpage HTML file into a complete external attack surface analysis, using passive techniques only.

01

Upload HTML Source

A locally saved copy of the organization's main webpage HTML file is provided as the primary data source. The Flask application verifies the file and reads its content into memory for processing.
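A minimal sketch of what the verification step could look like is shown below. The allowed extensions, the size cap, and the function name `read_uploaded_html` are assumptions made for illustration; the project's actual checks are not described here.

```python
from pathlib import PurePath

ALLOWED_EXTENSIONS = {".html", ".htm"}  # assumed whitelist
MAX_UPLOAD_BYTES = 5 * 1024 * 1024      # assumed 5 MiB cap


def read_uploaded_html(filename: str, data: bytes) -> str:
    """Validate an uploaded file and return its content as text."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or 'none'}")
    if not data:
        raise ValueError("uploaded file is empty")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("uploaded file exceeds the size limit")
    # Saved pages are not always UTF-8; replace undecodable bytes instead of failing.
    return data.decode("utf-8", errors="replace")
```

Inside a Flask route this helper might be called with the `filename` attribute and the result of `read()` on an entry of `request.files`, with any `ValueError` turned into an error message for the user.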

02

Extract Hosts & Domains

BeautifulSoup parses the HTML and scans tags for attributes like href, src, action, data-src, and srcset. Hostnames are extracted, normalized, and deduplicated. The most-repeated registered domain is selected as the main domain.
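The extraction step can be sketched as follows. To keep the example free of third-party packages it uses the standard library's `html.parser` in place of BeautifulSoup, so it illustrates the idea rather than the project's code. The names `HostCollector`, `registered_domain`, and `extract_hosts` are hypothetical, and the "last two labels" rule for the registered domain is a simplifying assumption that breaks on suffixes such as co.uk (a Public Suffix List lookup would be needed there).

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit

URL_ATTRS = {"href", "src", "action", "data-src", "srcset"}


class HostCollector(HTMLParser):
    """Collects hostnames from URL-bearing attributes of every start tag."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in URL_ATTRS or not value:
                continue
            if name == "srcset":
                # srcset holds comma-separated "url descriptor" pairs.
                urls = [p.split()[0] for p in value.split(",") if p.strip()]
            else:
                urls = [value.strip()]
            for url in urls:
                try:
                    host = urlsplit(url).hostname
                except ValueError:  # malformed URL, e.g. a broken IPv6 literal
                    continue
                if host:  # relative links and mailto: URLs have no hostname
                    self.hosts.append(host.lower().rstrip("."))


def registered_domain(host: str) -> str:
    """Naive approximation: keep the last two DNS labels."""
    return ".".join(host.split(".")[-2:])


def extract_hosts(html: str):
    """Return (sorted unique hostnames, most frequently referenced domain)."""
    parser = HostCollector()
    parser.feed(html)
    parser.close()
    counts = Counter(registered_domain(h) for h in parser.hosts)
    main_domain = counts.most_common(1)[0][0] if counts else None
    return sorted(set(parser.hosts)), main_domain
```

Counting registered domains before deduplicating hostnames is what lets the most-repeated domain win, even when a page references many distinct third-party hosts once each.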

03

Collect Passive OSINT

The Passive OSINT Data Collection Module queries public APIs and databases for DNS records, WHOIS information, SSL certificate details, reverse DNS, ASN details, and IP ownership information — all without direct interaction with the target.

04

Analyze External Exposure

Collected OSINT data is analyzed to identify publicly exposed assets, third-party dependencies, unresolved hosts, and potentially risky external services associated with the target organization.
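One way to picture this stage is a small classifier that labels each discovered host as first-party, third-party, or unresolved. The labels, the dictionary keys, and the function names below are illustrative assumptions. The default resolver asks the system's configured DNS resolver through `socket.gethostbyname`, and it is injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, Dict, List, Optional


def default_resolver(host: str) -> Optional[str]:
    """Resolve a hostname via the system resolver; None if it does not resolve."""
    try:
        return socket.gethostbyname(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def classify_hosts(
    hosts: List[str],
    main_domain: str,
    resolve: Callable[[str], Optional[str]] = default_resolver,
) -> List[Dict[str, Optional[str]]]:
    """Label each host relative to the main domain and record its resolved IP."""
    results = []
    for host in hosts:
        first_party = host == main_domain or host.endswith("." + main_domain)
        ip = resolve(host)
        if ip is None:
            label = "unresolved"
        elif first_party:
            label = "first-party"
        else:
            label = "third-party"
        results.append(
            {"hostname": host, "classification": label, "resolved_ip": ip}
        )
    return results
```

The suffix check includes the leading dot so that a host like notexample.com is not mistaken for a subdomain of example.com.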

05

Generate Risk Score

A severity-weighted scoring algorithm evaluates each finding — exposed hosts, third-party dependencies, missing information, and detected vulnerabilities — to produce a consolidated Baseline Risk Score from 0–100 with category breakdowns.
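A severity-weighted score of this kind might be computed as in the sketch below. The weights and the level thresholds are hypothetical values chosen for illustration, not the project's actual parameters, so the numbers it produces should not be compared with scores reported by CyberSherlock.

```python
from collections import defaultdict

# Hypothetical weights per severity level.
SEVERITY_WEIGHTS = {"Low": 5, "Medium": 15, "High": 30, "Critical": 50}


def risk_level(score: int) -> str:
    """Map a 0-100 score to a level using assumed thresholds."""
    if score >= 75:
        return "Critical"
    if score >= 50:
        return "High"
    if score >= 25:
        return "Medium"
    return "Low"


def score_findings(findings):
    """Sum severity weights per category and cap the total at 100."""
    by_category = defaultdict(int)
    for finding in findings:
        by_category[finding["category"]] += SEVERITY_WEIGHTS[finding["severity"]]
    score = min(sum(by_category.values()), 100)
    return {
        "score": score,
        "level": risk_level(score),
        "by_category": dict(by_category),
    }
```

Capping the sum keeps the score inside the 0-100 range even when many findings accumulate, while the per-category totals preserve the breakdown.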

06

Social Media

This feature automatically identifies the target's presence across multiple platforms, including Instagram, X (Twitter), LinkedIn, and others. By scanning the HTML source, CyberSherlock extracts these social handles and links them together. This process helps in visualizing the full digital footprint of the target, making it easier to see how different accounts are interconnected and related to the main domain.
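The handle extraction described above could be approximated as follows. This is a sketch under stated assumptions: the platform table, the reserved path names, the LinkedIn path shapes (`/company/<name>` and `/in/<name>`), and the function name are all illustrative, and links are pulled out with a simple regular expression rather than a full HTML parser.

```python
import re
from urllib.parse import urlsplit

PLATFORMS = {
    "instagram.com": "Instagram",
    "twitter.com": "X (Twitter)",
    "x.com": "X (Twitter)",
    "linkedin.com": "LinkedIn",
}
RESERVED = {"share", "intent", "sharer", "home"}  # paths that are not profiles
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def extract_social_handles(html: str):
    """Return sorted (platform, handle) pairs found in href attributes."""
    found = set()
    for url in HREF_RE.findall(html):
        try:
            parts = urlsplit(url)
        except ValueError:
            continue
        host = (parts.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        platform = PLATFORMS.get(host)
        segments = [s for s in parts.path.split("/") if s]
        if platform is None or not segments:
            continue
        if platform == "LinkedIn":
            # Assumed profile shapes: /company/<name> or /in/<name>.
            if len(segments) < 2 or segments[0] not in {"company", "in"}:
                continue
            handle = segments[1]
        else:
            handle = segments[0]
        if handle.lower() in RESERVED:
            continue
        found.add((platform, handle.lstrip("@")))
    return sorted(found)
```

Filtering reserved path names matters because share buttons such as twitter.com/intent/tweet would otherwise be reported as accounts.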

07

AI Recommendations

The CyberSherlock AI Security Advisor processes all findings and generates prioritized, context-aware remediation recommendations aligned with OWASP security practices to help users understand and mitigate identified risks.
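How the findings reach the AI model is not specified here. One plausible approach, sketched below, is to serialize them into a prompt with the most severe items first. The prompt wording, the finding fields, and the function name are assumptions, and the call to the language model itself is deliberately left out.

```python
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def build_advisor_prompt(findings, risk_score: int, question: str) -> str:
    """Render findings (most severe first) and the user's question as one prompt."""
    # sorted() is stable, so findings of equal severity keep their input order.
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
    finding_lines = [f"- [{f['severity']}] {f['title']}" for f in ordered]
    return "\n".join(
        [
            "You are a security advisor. Align recommendations with OWASP guidance.",
            f"Baseline risk score: {risk_score}/100",
            "Findings (most severe first):",
            *finding_lines,
            f"User question: {question}",
        ]
    )
```

Grounding the prompt in the collected findings, rather than sending the question alone, is what would let the answers stay contextual to a given scan.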

AI Advisor — Real Interaction Log

CyberSherlock AI Security Advisor

The AI chatbot uses collected OSINT findings and baseline risk analysis to generate contextual, OWASP-aligned security recommendations. Below are actual interaction examples from system testing.

🤖

CyberSherlock Security Advisor

OWASP-aligned AI · Powered by collected OSINT findings

👤
Can you justify the current Risk Score? Tell me which factor contributed most to this level.
AI
Based on the EASM scan results provided, the overall Risk Score is 22/100 (Low). The main contributor is the "Third-party hosts detected" finding with a severity of Medium. This indicates one external host was found in the saved HTML, which may expand external exposure. The finding suggests dependencies on external resources that are outside direct organizational control.
👤
Provide a top-3 priority action plan for the IT team to reduce our external attack surface.
AI
[SECURITY ALERT] Based on the results, I recommend:

1. Vulnerability Remediation: Review identified third-party hosts (e.g., external domains found in HTML source) and assess whether they are necessary dependencies.

2. Network Segmentation: Isolate and restrict access to sensitive areas to minimize the impact of external dependency exposure.

3. Regular Security Audits: Schedule periodic passive OSINT monitoring to detect new threats and changes to your external attack surface in a timely manner.
👤
Given that "Third-party hosts" are the main risk factor, provide a technical recommendation to harden web assets and mitigate potential supply chain attacks.
AI
Technical Recommendation:

1. Inventory and Monitoring: Conduct a comprehensive inventory of all third-party libraries, CDNs, and external scripts referenced in your web assets.

2. Vulnerability Scanning: Perform regular passive scans against known CVE databases for all identified third-party dependencies to detect outdated or vulnerable components.

3. Subresource Integrity (SRI): Implement SRI hashes on all externally loaded scripts and stylesheets to prevent tampering — as recommended by OWASP A06: Vulnerable and Outdated Components.
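To make the SRI recommendation in the last reply concrete, the sketch below computes an integrity value for a script's bytes: a base64-encoded SHA-384 digest prefixed with the algorithm name. The function name is hypothetical, and obtaining the exact bytes served by the CDN is assumed to happen elsewhere.

```python
import base64
import hashlib


def sri_hash(content: bytes) -> str:
    """Return a Subresource Integrity value (sha384) for the given bytes."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")
```

The resulting string goes into the `integrity` attribute of the `script` or `link` tag, together with `crossorigin="anonymous"` for cross-origin resources, so the browser refuses to run a file whose content no longer matches.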
Ethical Security Research

Built on Responsible Intelligence

CyberSherlock follows ethical cybersecurity practices by relying only on passive Open-Source Intelligence (OSINT) collection methods. The platform avoids intrusive scanning, penetration testing, exploitation activities, or direct interaction with target systems.

✗ Active Scanning · ✗ Exploitation · ✗ Penetration Testing · ✓ Passive OSINT Only
🛡️

Zero Intrusion

CyberSherlock never sends probes or intrusive requests to target systems. All data is sourced from public records without generating detectable network traffic.

⚖️

Legal & Compliant

Operates exclusively on publicly accessible information: no unauthorized access, no grey-area techniques, no credential harvesting.

📖

OWASP Aligned

All AI recommendations and risk assessments follow OWASP Top 10 and established security frameworks, validated before being presented to users.

Passive OSINT Only
OWASP Top 10 Aligned
No Active Scanning
No Penetration Testing
No Credential Harvesting
Responsible Disclosure
Tested on OWASP Juice Shop

Technologies Used

🐍 Python
⚗️ Flask Web Framework
🥣 BeautifulSoup
🌐 WHOIS / DNS APIs
🔐 SSL / TLS Analysis
📡 ASN & IP OSINT
🤖 AI / LLM Integration
⚖️ OWASP Standards
🧪 OWASP Juice Shop Testing

Understand your public exposure
before attackers do.

Upload an HTML source file and get a complete external attack surface analysis — passive, ethical, and OWASP-aligned.

🔍 Start Analysis

University of Bahrain · ITCY 499 Senior Project · Passive OSINT Only · No Active Scanning · OWASP Aligned