WHOIS (RFC 3912) Protocol Analyzer
Spicy-based WHOIS (RFC 3912) protocol analyzer for Zeek.
Detailed Description
WHOIS is a basic TCP request/response protocol: client sends one query line, server returns free-form text and closes.
This analyzer interprets both halves of the exchange and writes the result to a structured whois.log.
It classifies the query as domain, ipv4, ipv6, or asn, then reads the reply (capped at 64 KB) and scans it for registry/RIR fields: owner, status, origin AS, registration, update and expiry dates, name servers, abuse contact.
Features
- Logs WHOIS queries and structured reply metadata to whois.log
- Dynamic protocol detection (DPD) via bidirectional signatures
- Reply time tracking (request-to-reply delta)
- Weirds for protocol anomalies (empty requests, unusually large queries)
- UTF-8/IDN support (tested against JP, CN, KR WHOIS servers)
Detection use cases (examples)
- Sinkhole / seizure: status of serverHold or clientHold marks a domain the registry has frozen.
- Routing intelligence: origin_as on a network query is the BGP-filter input; flag route objects whose origin AS doesn't match expected peering.
- Fresh infrastructure: a registered date inside your lookback window flags newly-stood-up domains; a short registry_expiry (1-year registration) sharpens the signal.
- Infrastructure pivot: name_server ties a domain to its DNS hosting; pivot to related domains sharing a name server.
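The hold and fresh-infrastructure checks above can be sketched as a small offline pass over a JSON whois.log. This is only an illustration, not part of the package: the helper names (parse_ts, flags_for, scan), the 30-day lookback, and the 366-day threshold for a "short" registration are all assumptions, and it expects timestamps in the form shown in the example output (2009-02-17T22:07:54Z).

```python
import json
from datetime import datetime, timedelta, timezone

# EPP codes that mark a registry- or registrar-level freeze.
HOLD_CODES = ("serverHold", "clientHold")


def parse_ts(value):
    """Parse a timestamp like 2009-02-17T22:07:54Z; return None if it does not fit."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except (TypeError, ValueError):
        return None
    return parsed.replace(tzinfo=timezone.utc)


def flags_for(rec, now, lookback_days=30):
    """Return hunting flags for one whois.log record (hypothetical helper)."""
    flags = []
    if rec.get("query_type") != "domain":
        return flags
    # Status entries look like "clientHold https://icann.org/epp#clientHold".
    for entry in rec.get("status", []):
        parts = entry.split()
        if parts and parts[0] in HOLD_CODES:
            flags.append("hold:" + parts[0])
    registered = parse_ts(rec.get("registered"))
    if registered is not None:
        age = now - registered
        if timedelta(0) <= age <= timedelta(days=lookback_days):
            flags.append("fresh")
            expiry = parse_ts(rec.get("registry_expiry"))
            # Assumed threshold: a registration of one year or less counts as short.
            if expiry is not None and expiry - registered <= timedelta(days=366):
                flags.append("short_registration")
    return flags


def scan(lines, now=None):
    """Yield (query, flags) for every JSON log line that raises at least one flag."""
    now = now or datetime.now(timezone.utc)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        flags = flags_for(rec, now)
        if flags:
            yield rec.get("query"), flags
```

To try it, pass an open whois.log to scan, for example `for query, flags in scan(open("whois.log")): print(query, flags)`.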
Requires
- Zeek 6.1.0 (bundled with Spicy 1.9.0) minimum
- C++ toolchain and libpcap headers are required to build the analyzer: gcc g++ make cmake libpcap-dev
- As with any zkg Spicy analyzer, the code is Spicy source and is compiled at install time
- NOTE: The official zeek/zeek container image omits these, so install them first or the build will fail with pcap.h: No such file or directory
Install
zkg package, from Zeek Package Source:
zkg install spicy-whois
Events
event WHOIS::request(c: connection, is_orig: bool, query: string)
Raised for each client query, with query holding the string stripped of its line terminator.
event WHOIS::reply(c: connection, is_orig: bool, data: string)
Raised once per reply, with data holding the full server text (read until close, capped at 64 KB).
The events above hand back raw bytes; WHOIS::log_whois(rec: WHOIS::Info) is where the analyzer's interpretation surfaces. Once per connection it emits the assembled WHOIS::Info record (query classified, reply fields extracted) that is written to whois.log.
See WHOIS answer schema for fields.
Example output
Run against a test pcap, then pretty-print whois.log with jq:
zeek -C -r testing/Traces/whois-domain.pcap whois.hlto scripts/__load__.zeek LogAscii::use_json=T
jq --color-output . whois.log
Domain lookup (whois-domain.pcap): registrar, EPP status codes, name servers, abuse contact:
{
"ts": 1779334478.346291,
"uid": "Cm3FuO2WPLUSPqUolb",
"id.orig_h": "192.168.1.231",
"id.orig_p": 63154,
"id.resp_h": "192.34.234.30",
"id.resp_p": 43,
"query": "domain cloudflare.com",
"query_type": "domain",
"resource": "CLOUDFLARE.COM",
"owner": "Cloudflare, Inc.",
"registered": "2009-02-17T22:07:54Z",
"updated": "2024-01-09T16:45:28Z",
"registry_expiry": "2033-02-17T22:07:54Z",
"name_server": [
"ns3.cloudflare.com",
"ns4.cloudflare.com",
"ns5.cloudflare.com",
"ns6.cloudflare.com",
"ns7.cloudflare.com"
],
"status": [
"clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited",
"clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
"clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited",
"serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited",
"serverTransferProhibited https://icann.org/epp#serverTransferProhibited",
"serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited"
],
"abuse_contact": "registrar-abuse@cloudflare.com",
"reply_time": 0.025169849395751953,
"reply_size": 3719
}
Network lookup (whois-net.pcap): the same record shape pivoted on query_type, here an RIR inetnum with server_name and origin_as populated and the domain-only fields absent:
{
"ts": 1779334777.802331,
"uid": "CBdloO3gjjCrOi6Q5l",
"id.orig_h": "192.168.1.231",
"id.orig_p": 64829,
"id.resp_h": "193.0.6.135",
"id.resp_p": 43,
"query": "95.217.0.1",
"query_type": "ipv4",
"server_name": "RIPE",
"resource": "95.217.0.0 - 95.217.15.255",
"owner": "ORG-HOA1-RIPE",
"origin_as": "AS24940",
"registered": "2023-12-12T12:40:45Z",
"updated": "2023-12-12T12:40:45Z",
"status": [
"ASSIGNED PA"
],
"reply_time": 0.16294193267822266,
"reply_size": 3800
}
Analyzer: Attachment, confirmation, and ports
A connection is logged only after two steps: the analyzer attaches to it, then the parser confirms the bytes are WHOIS.
Attach happens on 43/tcp. Analyzer::register_for_ports binds the analyzer to that port, so every connection on 43/tcp gets the analyzer at connection start, before any payload is parsed.
Confirm happens in the parser, independently on each side. A query line that parses calls spicy::accept_input(); a reply that carries data does the same. Either alone confirms, so a client query with no reply still tags the connection. A parse failure on either side calls zeek::reject_protocol() instead.
Confirmation, not the port match, is what sets service=whois in conn.log. Non-WHOIS traffic on 43/tcp still gets the analyzer attached, but never confirms, so service stays empty.
The full path is port → attach → parse → accept_input() confirms → service=whois.
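Because confirmation rather than the port sets the service, conn.log can be used to spot 43/tcp connections where the analyzer attached but never confirmed. The sketch below is one hedged way to do that offline; it assumes conn.log was written as JSON (as with LogAscii::use_json=T in the example run) and that service, when present, is a comma-separated string. The function name unconfirmed_port43 is hypothetical.

```python
import json


def unconfirmed_port43(lines):
    """Yield uids of 43/tcp connections whose service never became whois."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if rec.get("id.resp_p") != 43 or rec.get("proto") != "tcp":
            continue
        # service is absent when no analyzer confirmed; otherwise split the list.
        services = [s for s in rec.get("service", "").split(",") if s]
        if "whois" not in services:
            yield rec.get("uid")
```

Usage would look like `list(unconfirmed_port43(open("conn.log")))`; the uids it returns can then be checked against other logs for what was actually spoken on the port.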
DPD signature
The signature in scripts/dpd.sig is a third, independent mechanism: a content-based attach path for non-standard ports.
WHOIS has no constant byte pattern or fixed-offset header to key against, so the signature pairs a client and server match tuned against the captured bytes in testing/Traces/. The server side fires only after the client query matches (requires-reverse-signature), avoiding false positives from other text protocols whose replies carry a stray keyword.
tcp-state originator/responder is used without established, matching core analyzers; payload exists only post-handshake, so established is redundant.
Client (originator): a single query line ending in CRLF:
- Character class covers domain/IP/ASN chars plus flag punctuation (- . @ = / + : ,) and a literal space, so RIPE-style flag queries like -T dn,ace example.de match.
- No \s in the class. An earlier version included it, silently matching internal \r/\n, so multi-line payloads (foo.com\r\nbar.com\r\n) and bare CRLF floods registered as valid single queries. Dropping \s rejects them.
- Underscore is excluded: it let SSH banners (SSH-2.0-libssh_…) match.
- \x80-\xff is kept for IDN/CJK queries.
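The effect of the client-side character class can be illustrated with a Python regex. This is an approximation written for this README, not the pattern from scripts/dpd.sig: the exact class, the use of a full match, and the name looks_like_query are assumptions made to show why dropping \s and the underscore matters.

```python
import re

# Approximation only; the real pattern lives in scripts/dpd.sig.
# Class: alphanumerics, flag punctuation (- . @ = / + : ,), a literal space,
# and high bytes for IDN/CJK. No \s and no underscore.
CLIENT_QUERY = re.compile(rb"[A-Za-z0-9\-.@=/+:, \x80-\xff]+\r\n")


def looks_like_query(payload: bytes) -> bool:
    """True if the payload is exactly one query line ending in CRLF."""
    return CLIENT_QUERY.fullmatch(payload) is not None
```

With this class, `looks_like_query(b"-T dn,ace example.de\r\n")` is accepted, while a two-line payload, a bare CRLF, or an SSH banner containing an underscore is rejected.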
Server (responder): keyword match, gated by requires-reverse-signature:
- Matches keywords present in real registry/registrar/RIR replies.
- Includes route6?: (not just route:) plus origin:/source:, closing a gap where IPv6 route6: objects from RIPE/RADB went undetected.
WHOIS on a non-standard port
To parse WHOIS on a port other than 43/tcp, add the port so the analyzer attaches there at connection start, the same path it uses on 43/tcp:
redef WHOIS::ports += { 4343/tcp };
Parsing limits and bounds
Cutoff bounds protect against malformed / hostile traffic:
- Request line (whois.spicy): printable bytes (\x09, \x20-\x7e, \x80-\xff for IDN), terminated by an optional CR and a required LF. An empty query raises whois_empty_request; a query over 512 bytes raises whois_oversized_request (the line still parses; the weird is the signal).
- Reply body: read to close, capped at 64 KB (&size=65536 &eod); the first 64 KB parse, bytes past the cap are discarded, so reply_size is truncated too.
- Field extraction (main.zeek): reply split on LF, each line on its first :; keys lowercased, values stripped, empties skipped. Single-valued fields are first-wins; status and name_server accumulate into a set (name_server lowercased to dedup), bounded by the 64 KB cap.
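The field-extraction rules in the last bullet can be sketched in Python as follows. The real logic lives in main.zeek; the KEY_MAP below is a hypothetical, deliberately tiny mapping from reply keys to log fields, included only so the split, first-wins, and set-accumulation behavior can be seen end to end.

```python
# Hypothetical key map for illustration; the real mapping lives in main.zeek.
KEY_MAP = {
    "registrar": "owner",
    "creation date": "registered",
    "name server": "name_server",
    "domain status": "status",
}
# Fields that accumulate into a set instead of keeping the first value.
MULTI = {"status", "name_server"}


def extract_fields(reply: str) -> dict:
    """Apply the split/lowercase/strip/first-wins rules to one reply body."""
    out = {}
    for line in reply.split("\n"):
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        field = KEY_MAP.get(key.strip().lower())
        value = value.strip()  # also drops a trailing CR
        if field is None or not value:
            continue  # unknown key or empty value: skipped
        if field in MULTI:
            if field == "name_server":
                value = value.lower()  # lowercase so duplicates collapse
            out.setdefault(field, set()).add(value)
        else:
            out.setdefault(field, value)  # first-wins
    return out
```

Splitting on the first colon is the detail that keeps values such as 2009-02-17T22:07:54Z and the EPP status URLs whole.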
WHOIS answer schema
Answers come in two forms, both mapped into one set of generic fields pivoted on query_type:
- domain responses (registrar/registry data)
- network responses (RIR inetnum/route/ASN objects)
Always read a value alongside query_type: the same column carries different elements per type (owner is a registrar for a domain, an mnt-by maintainer for a network).
| Field | Domain response | Network response | Why it matters to a defender |
|---|---|---|---|
| query | the query string | the query string | What was looked up |
| query_type | domain | ipv4 / ipv6 / asn | Split registrar lookups from routing-intel lookups |
| server_name | - | source registry (RIPE, ARIN…) | Which database answered |
| resource | Domain Name | NetRange / CIDR / inetnum / route | The object the response describes |
| owner | Registrar | org / mnt-by maintainer | Who controls the resource |
| origin_as | - | OriginAS / origin: | BGP-filter input the CCC RIPE talk focuses on |
| registered | Creation Date | RegDate / created: | Age: new registrations are suspicious |
| updated | Updated Date | last-modified | Recent repoint/takeover signal |
| registry_expiry | Registry Expiry Date | - | Short (1-year) registrations are a hunting signal |
| name_server | Name Server (set) | - | DNS hosting + pivot to related domains via shared NS |
| status | EPP codes | - | serverHold/clientHold = seized/sinkholed |
| abuse_contact | Registrar Abuse Contact Email | - | Abuse reporting + bulletproof-registrar fingerprinting |
| reply_time | request→reply delta | request→reply delta | Latency: tunneling/abuse signal |
| reply_size | total bytes | total bytes | Volume, without storing the blob |
Protocol Reference
License
BSD-3-Clause, see COPYING.
Credits
Created
- Craig P (@detection-labs)