The Good Tech Companies - How IPinfo Turns Registry Data into Real Intelligence

Starting point is 00:00:00 This audio is presented by HackerNoon, where anyone can learn anything about any technology. How IPinfo turns registry data into real intelligence, by IPinfo, IP data provider. The internet is a decentralized network by nature. Hundreds of thousands of computer networks, each with their own peculiarities and policies, are interconnected to form a larger, global network. There is little to no coordination between these networks. Each network is mainly concerned with its peering neighbors and doesn't need to care about networks further away. A network operator in Japan never needs to coordinate with a network operator in Europe to be able to exchange packets with them. They do not need

Starting point is 00:00:40 to use the same equipment, vendors, software, or configurations on their networks. It might seem impossible for such a system to work at a global scale and over a long period of time, and yet, for the most part, it does. This is achieved in part thanks to standards organizations like the IETF, which defines the technical standards that must be used to interconnect networks, but also through the regional internet registries, RIRs, which coordinate the allocation of IP addresses and other shared internet resources. In this blog post I will explain what IP addresses and autonomous system numbers are, how they are allocated successfully on the internet, along with their limitations, and how IPinfo's actively measured datasets turn static registry data

Starting point is 00:01:22 into a verifiable, real-time intelligence layer. IP network challenges. The common point to all networks on the internet is the use of the aptly named Internet Protocol, or IP. In an IP network, each piece of equipment is assigned an IP address that can be used to send messages to others. There are two key challenges in IP networks: finding paths, or routes, between two computers.

Starting point is 00:01:46 And making sure each piece of equipment is assigned a unique address among a finite set of possible addresses. To better understand the routing challenge, let's have a closer look at an IP network. A typical IP network is made up of multiple computers interconnected through a device called a switch. Each interface of the switch is connected either to a computer or to another switch. Two IP networks. Within each network, hosts are connected through a switch and can send broadcast messages to all other computers in the same network. Broadcast messages are not allowed past the network boundary. Instead, a device called a router is used. A router knows in advance how to reach all

Starting point is 00:02:24 other networks. A switch has two modes of operation when it receives a message from one computer to another. Either it knows, from observing the traffic, to which specific interface the message must be forwarded, or it doesn't, and it sends the message to all interfaces except the one where the message arrived, in a process called flooding. In this setup, there is no need to know the outgoing interface for all destinations in advance, since a message can always be broadcast to everyone when the interface is not known. Networks that support broadcast messages are convenient, but they also suffer from scalability issues. Imagine if, every time a computer wanted to send a message to a new destination on the internet, the message were sent to all other computers on

Starting point is 00:03:04 the internet in the hope that it would eventually reach its intended destination. Not only would this be a privacy nightmare, it would also clog the network with useless traffic. From local networks to autonomous systems. To solve this scalability issue, broadcast messages are not allowed between networks. Instead, another type of device, called a router, is used to interconnect networks. Unlike a switch, a router must know the outgoing interface for each destination in advance. One way to populate this routing table is for the network operator to manually input each route into all routers. However, this process doesn't scale to large networks and doesn't allow dynamic reconfiguration when routers are added or removed, or when links

Starting point is 00:03:46 between them become unavailable, e.g., due to a broken cable. Instead, the most common way of populating routing tables is by using a routing protocol. Various kinds of routing protocols exist, but the main idea is for each router to share with its neighbors the list of the networks it is connected to, along with the networks it learned from its neighbors. After some convergence time, routers should share the same view of the network. In practice, different organizations will use different routing protocols and policies depending on their needs, for example, to optimize for throughput or latency. They might also want to filter which routes they accept from other networks to control where their traffic will go. A collection of networks that share

Starting point is 00:04:26 a common routing policy is called an autonomous system, or AS. Typically, ISPs and large organizations will operate one or more ASes. Within each AS, they are free to use the routing protocol of their choice. The internet is a collection of autonomous systems, each with their own internal routing protocols and policies, that use the Border Gateway Protocol, BGP, to exchange routes between them. A proto-internet with two autonomous systems. Inside an AS, the network operator is free to implement any routing policy and use any routing protocol. Between ASes, the BGP routing protocol must be used. One key requirement of the BGP protocol is that each AS has a unique number, called an autonomous system number, ASN, assigned to it.
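
The route-sharing idea described above, where each router tells its neighbors about the networks it is connected to and the networks it has learned, can be illustrated with a minimal sketch. This is a simplified hop-count exchange for illustration only, not BGP or any specific routing protocol, and the router names, prefixes, and the `converge` function are hypothetical.

```python
from typing import Dict, List


def converge(
    links: Dict[str, List[str]], attached: Dict[str, List[str]]
) -> Dict[str, Dict[str, int]]:
    """Let every router advertise its known networks until nothing changes.

    links:    router -> list of directly connected neighbor routers
    attached: router -> list of networks directly connected to that router
    Returns   router -> {network: number of router hops to reach it}
    """
    # Each router starts out knowing only its directly attached networks.
    tables = {r: {net: 0 for net in nets} for r, nets in attached.items()}

    changed = True
    while changed:  # keep exchanging routes until the tables are stable
        changed = False
        for router, neighbors in links.items():
            for neighbor in neighbors:
                # Advertise everything this router knows to the neighbor.
                for network, hops in list(tables[router].items()):
                    best_known = tables[neighbor].get(network, float("inf"))
                    if hops + 1 < best_known:
                        tables[neighbor][network] = hops + 1
                        changed = True
    return tables


# Hypothetical topology: r1 -- r2 -- r3, one local network per router.
links = {"r1": ["r2"], "r2": ["r1", "r3"], "r3": ["r2"]}
attached = {
    "r1": ["10.0.1.0/24"],
    "r2": ["10.0.2.0/24"],
    "r3": ["10.0.3.0/24"],
}

tables = converge(links, attached)
for router, table in sorted(tables.items()):
    print(router, dict(sorted(table.items())))
```

After convergence, every router in this toy topology knows how far away each of the three networks is, even though r1 and r3 never talk to each other directly. Real protocols add timers, route withdrawal, and policy filtering on top of this basic exchange.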

Starting point is 00:05:11 The need for unique resources. For the internet to work, we need two things: unique IP addresses and unique autonomous system numbers. If we let network operators pick these values at random, there is a very high risk of collision, which would result in parts of the network being unreachable or receiving traffic not meant for them. The allocation of these resources is coordinated by the regional internet registries, RIRs. RIRs are non-profit organizations that maintain a registry, or database, of internet resources. There are five RIRs that cover different parts of the world: AFRINIC for Africa, ARIN for North America, APNIC for Asia and Oceania, LACNIC for South America, and the RIPE NCC for Europe and the Middle East. The exact types of records and metadata

Starting point is 00:05:58 recorded in each database depend on the RIR, but the two most important are IP address ranges and autonomous system numbers. Resource allocation is hierarchical in nature. The Internet Assigned Numbers Authority, IANA, distributes these resources to RIRs, which in turn distribute them to organizations that need them, which in turn can distribute them to their customers. IPv4 addresses are by far the scarcest resources, with only 4.2 billion 32-bit addresses to split among every device on the planet. RIRs now have waiting lists in place, as they've run out of available IPv4 addresses, and have strict quotas. For example, the RIPE NCC limits the maximum number of IPv4 addresses allocated to an ISP to 256 addresses. IPv6 addresses do not suffer from

Starting point is 00:06:47 this problem thanks to the use of 128-bit addresses, which provide a vastly larger address space. The majority of the IPv6 address space is currently unallocated and reserved for future use. ASNs used to be defined using 16-bit numbers, meaning a maximum of 65,536 ASNs could be allocated, but were later extended to 32 bits to account for the growing number of ASNs, more than 120K as of 2025. The current allocation of these resources to RIRs can be seen in the IANA IPv4 Address Space Registry, the IANA IPv6 Global Unicast Address Assignments, and the IANA Autonomous System Numbers list. The exact resource allocation policy depends on each RIR, but they will typically distinguish

Starting point is 00:07:33 two types of allocations, ISPs and end users. ISP allocations can be further sub-allocated by the ISP to its own customers. We can also say that the ISP acts as a local internet registry, LIR. For these kinds of allocations, ISPs usually pay a membership fee that funds the registry operation. On the other hand, end-user allocations are usually exempt from membership requirements, but have the restriction that they cannot be sub-allocated. Let's have a look at a concrete example, with the 2a01:c910:1:20e::/64 IPv6 range. The allocation hierarchy is as follows. We clearly see the IANA → RIR → LIR →

Starting point is 00:08:21 end user allocation hierarchy. Who needs to own resources? One might wonder why bother obtaining IP addresses from an RIR or LIR when an organization could just use the IP addresses handed out by its ISP. The main reasons for owning IP addresses are: Portability, the ability for an organization to move IP addresses across ISPs without renumbering all of its devices, i.e., having provider-independent address space. Multihoming, announcing IP addresses from multiple ISPs for redundancy. Additional benefits can be: Reputation. By controlling how IP addresses are used, the owner can make sure they are not used for nefarious purposes, e.g., sending spam, and implement its own abuse handling policy

Starting point is 00:09:09 by setting point of contact fields in the WHOIS database. Reverse DNS delegation. IANA maintains the root zones for reverse DNS records, in-addr.arpa and ip6.arpa, and delegates the child zones to the RIRs, which in turn can delegate the zones matching the owned resources to the organization's own DNS servers. See the RIPE NCC's documentation, for example. The reason for obtaining an AS number is simpler: it is required to establish public BGP sessions. Thus, any network willing to announce IPs on the internet must have its own ASN. Note that IP address and ASN allocations are independent. An organization might have an ASN but only route IPs brought by its customers, and similarly an organization can own IPs but not have the infrastructure

Starting point is 00:09:57 to route them and rely on an ISP with an ASN for that. In practice, most smaller organizations are fine using IP space assigned by their ISP and don't need their own ASN. Ownership becomes important for companies that run large-scale infrastructure, provide internet services, or need independence and resilience, such as ISPs, hosting providers, CDNs, and enterprises with complex networks. For these organizations, direct resource ownership offers control, stability, and flexibility that simply leasing addresses from an ISP cannot match. What WHOIS and RIR data reveal. IP address and ASN records in the RIR databases contain information about the owner of these resources, but they can also contain additional metadata like the intended usage of these resources

Starting point is 00:10:45 or contact addresses. Let's have a look at some resources. To do so, we will use the WHOIS protocol with the whois command-line utility, but note that there is also a newer HTTP-based protocol called RDAP. We can query the RIPE database for the 2a01:c910:1:20e::/64 range with the following command: whois -h whois.ripe.net 2a01:c910:1:20e::/64. If the WHOIS server is not specified, whois.ripe.net in this case, then the whois command will use the IANA server, whois.iana.org, which will redirect to the appropriate RIR server in most cases. However, it's good practice to specify the proper WHOIS server to make sure you get the expected record. Allocation in the RIPE database. The record tells us multiple things. The name of the IP range gives us a hint as to its purpose. In this case, it seems to be allocated to the French Post. The country tells us that this range is probably used in France, although this attribute is completely free for

Starting point is 00:11:57 the user to set and might be unrelated to IP geolocation. See the next section. The admin-c and tech-c attributes give us the administrative and technical contacts for this IP range. In this case, A82914-RIPE links to a record containing the addresses of Orange Business Services, the LIR support center, while DL7113-RIPE tells us the address of the French Postal Service IT services. The mnt-by attribute tells us the entity allowed to make changes to the inetnum record itself. Naturally, in this case it's the LIR, Orange Business Services. The role records linked in the admin-c, tech-c, and mnt-by attributes contain postal addresses, phone numbers, and email addresses. For example, if we look up the A82914-RIPE point of contact in the

Starting point is 00:12:46 RIPE database. WHOIS and RIR limitations. WHOIS does not equal geolocation. WHOIS records are often used as a proxy for IP geolocation. However, this usage is precarious, as there is no guarantee that the addresses in the records are correct or linked to the actual location where the IP addresses are used. For example, the RIPE database documentation has this to say regarding the country attribute: This identifies a country using the ISO 3166-2 letter country codes. It has never been specified what this country represents. It could be the location of the head office of a multinational company or where the server center is based or the home of the end user. Therefore, it cannot be used in any reliable way to map IP addresses to countries. Furthermore, the granularity of

Starting point is 00:13:38 IP address allocation is often too coarse to represent anything below the country level. The proper solution for network operators to share geolocation information is geofeeds. Route objects are informational only. Another common source of confusion regarding WHOIS records comes from route records. These records tell us which ASNs are expected to announce an IP range. For example, the following record tells us that 90.38.0.0/16 should be announced by AS3200. Route record in the RIPE database. However, these records are purely informational. Some network operators might use them to reject routes where the origin ASN doesn't match, but

Starting point is 00:14:20 some might not use those records at all. Furthermore, these records can exist even if an IP range is not actually announced. As such, these records cannot be used to determine which ASN announces an IP range. This information can, however, be obtained from the BGP routing protocol. The messiness of public registry data. The RIRs' databases are publicly available, which makes them great sources of information about the entities operating on the internet. However, this data can also be quite messy. RIRs do not all support the same record types, and even for common records, the attributes supported might be different. Furthermore, some fields, like the address, are free-form and their validity is not enforced by the RIRs. It is not uncommon to find invalid or

Starting point is 00:15:06 mangled addresses in WHOIS records. How IPinfo transforms static records into dynamic intelligence. The internet's addressing system was built on a cooperative trust model: registries publish allocations, operators document routing intentions, and the community assumes these records are correct. However, this model also allowed some actors to disguise malicious or unwanted traffic. For example, web crawlers can rotate IP addresses and ASNs often to evade detection. See this Cloudflare blog post for a recent example. IPinfo sheds light on these issues with our rich data. IP WHOIS. Our IP WHOIS data maps IP ranges to their owners in a standardized and enriched

Starting point is 00:15:48 format across all regional internet registries. This data is useful to understand the allocation hierarchy and history of an IP address and, for example, to distinguish stable, ISP-allocated IPs from IPs from IP brokers, which might change hands frequently. This data can also be used to discover assets owned by a particular organization and estimate their attack surface for proactive management. IP to Company and IP Ranges. Our IP to Company and IP Ranges datasets are distilled versions of our WHOIS dataset that always return the most specific information for an IP address. They are useful when one cares only about the final owner of an IP address and not the full allocation hierarchy. They allow users to easily obtain all IP ranges belonging to an organization

Starting point is 00:16:35 by looking up its name or domain name. IP to ASN. Our IP to ASN data maps IP addresses to their origin ASN, the organization that routes the IP address on the internet. This data can be used to know the ISP of a user, but also to filter human traffic from bot traffic, which often originates from hosting and cloud providers' ASNs. IP Geolocation. Our IP Geolocation data, informed by ProbeNet, our internet measurement platform, provides accurate location attribution

Starting point is 00:17:04 down to the city level, detecting discrepancies between WHOIS country codes and the IP's observed geography. IP Privacy Detection. Our IP Privacy Detection data flags IPs using VPNs, proxies, Tor nodes, residential proxies, and other anonymizing services, enabling users to filter bot traffic, prioritize investigations, and handle anonymized connections with the appropriate caution. From assumptions to evidence. Traditional registry data is valuable but incomplete. Without validation, it's easy to misattribute ownership, geography, or routing, leading to costly blind spots. IPinfo uses publicly available signals, like WHOIS data and geofeeds, standardizes and organizes that information into a consistent format, and builds off that foundation with partner and vendor data as well as our own proprietary

Starting point is 00:17:55 intelligence. This approach turns registry data from a static directory into a dynamic, verifiable intelligence layer. It enables network defenders, researchers, and operators to move from assumption-based analysis to evidence-backed mapping of internet resources, reducing false positives and uncovering infrastructure that would remain hidden in traditional WHOIS-only workflows. Ready to see beyond registry records? Static WHOIS and RIR data only tell part of the story. IPinfo's datasets combine registry information with real-time measurements, advanced privacy detection, and curated intelligence to give you a complete view of the internet's infrastructure. About the author: Maxime Mouchet. Max currently works as a data engineer on IP Geolocation at IPinfo. Previously,

Starting point is 00:18:43 he was a postdoctoral researcher at Sorbonne University, where he worked on multipath traceroute measurements. Thank you for listening to this HackerNoon story, read by artificial intelligence. Visit hackernoon.com to read, write, learn and publish.
