The Good Tech Companies - How to Accurately Evaluate IP Geolocation Data Beyond Provider Consensus
Episode Date: August 4, 2025
This story was originally published on HackerNoon at: https://hackernoon.com/how-to-accurately-evaluate-ip-geolocation-data-beyond-provider-consensus. Learn how to evaluate IP data accuracy using ground truth, network physics, and IPinfo’s ProbeNet to avoid relying on outdated or misleading sources. Check more stories related to cybersecurity at: https://hackernoon.com/c/cybersecurity. You can also check exclusive content about #ip-data-accuracy, #ip-geolocation-evaluation, #ipinfo-probenet, #ground-truth-ip-data, #whois-limitations, #ip-geolocation-metrics, #hackernoon-top-story, #good-company, and more. This story was written by: @ipinfo. Learn more about this writer by checking @ipinfo's about page, and for more stories, please visit hackernoon.com. IP geolocation accuracy can’t be judged by consensus alone, as many providers rely on outdated WHOIS data. IPinfo uses ground truth datasets and physics-based methods like ProbeNet for precise results. A proper evaluation involves verified ground truth, RTT triangulation, and clear metrics, ensuring data accuracy beyond trust or majority opinion.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
How to accurately evaluate IP geolocation data beyond provider consensus.
By IPinfo, IP data provider.
When comparing IP geolocation providers, how do you know which one is right when they disagree?
The natural tendency is to trust what's familiar, usually data from your existing provider.
But what if your current source of truth is actually wrong?
Being an outlier doesn't mean data is inaccurate.
In fact, when it comes to IP geolocation,
it often means IPinfo is identifying the true location
while others are relying on outdated or self-reported data.
This post will provide a framework
for genuinely evaluating IP data accuracy beyond simply comparing providers against each other.
Why do providers give different answers?
Multiple providers may place the same IP address in different locations.
This happens because of:
Dependence on outdated records. Many providers rely heavily on WHOIS data and geofeeds, which are often self-reported and unverified.
Intentional misreporting. Hosting providers sometimes deliberately falsify server locations to attract customers. We found a case where servers physically located in Amsterdam were advertised as being in 27 different countries.
Methodological differences. Some providers use empirical measurements, while others depend on self-reported or static datasets.
Why consensus isn't a reliable measure of accuracy
It's tempting to assume that the majority is correct, but in IP geolocation that's often not true. Many providers pull from the same flawed sources, creating an illusion of agreement even when the data is wrong.
Outlier results aren't always mistakes.
In fact, they may indicate more rigorous, reality-based measurement. For example,
in one case, the IP address 64.138.26.13 was placed in the US by every other provider,
but ping tests from multiple locations revealed the lowest RTTs came from Singapore,
evidence that the server was actually in Singapore. Our findings were later confirmed via WHOIS records showing
a local office of Hamanetics Corporation. You can read the full analysis in our post
on ping-based geolocation vs WHOIS records.
Measuring accuracy requires trusted ground truth
The gold standard for evaluating IP geolocation accuracy is comparing against a reliable set of ground truth data: IP addresses with known, verified locations. Here's how to establish and use ground truth effectively.
Sources of ground truth data
Organizations often have access to various sources of ground truth.
1. Device location data. Mobile apps or websites can collect GPS coordinates alongside IP addresses, creating a dataset of known locations.
2. Customer-reported locations. Information gathered during signup, verification, or from support tickets.
3. Corporate network data. For enterprise clients, the exact locations of office IP ranges are known.
4. Verification systems. Multi-factor authentication or login verification systems that confirm user locations.
Avoiding ground truth pitfalls
Not all ground truth is created equal. Watch out for these common issues.
Circular references. Ensure your ground truth isn't derived from IP geolocation
itself.
Some mobile SDKs fall back to IP-based location when GPS is unavailable.
Perfect match suspicion.
Be wary of ground truth data that aligns too perfectly with a specific provider's geolocation
data.
For example, if your device locations match exactly the coordinates that MaxMind gives
for a city,
rather than showing natural variation within the city,
this could indicate the data source is using IP geolocation as a fallback rather than actual device location.
Legitimate GPS-based ground truth should show natural distribution patterns within cities and regions.
VPN and proxy usage. Users may connect through VPNs or proxies, causing their IP addresses
to appear from different locations than their physical presence. Self-reported inaccuracies.
Users may provide incorrect information about their location, either accidentally or deliberately.
Outdated information. IP assignments change frequently. Ground truth must be recent to
be valid. The importance of recency changes with the geographical resolution you're interested
in.
IPs change cities much more often than they change countries.
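One rough way to screen for the perfect match problem is to measure how many of a city's ground truth points sit on exactly the same coordinate. The sketch below is only an illustration: the function name, the rounding to four decimal places, and the sample coordinates are assumptions made for this example, not part of any provider's method.

```python
from collections import Counter


def top_coordinate_share(points, decimals=4):
    """Return the share of points that sit on the single most common coordinate.

    points: list of (lat, lon) tuples for one city.
    A share close to 1.0 suggests a shared centroid (possible IP-based
    fallback) rather than naturally scattered device locations.
    """
    if not points:
        return 0.0
    rounded = [(round(lat, decimals), round(lon, decimals)) for lat, lon in points]
    _, count = Counter(rounded).most_common(1)[0]
    return count / len(rounded)


# Illustrative, made-up samples around Amsterdam.
gps_like = [(52.3702, 4.8952), (52.3611, 4.9083), (52.3791, 4.9003), (52.3567, 4.8721)]
centroid_like = [(52.3740, 4.8897)] * 3 + [(52.3611, 4.9083)]

print(top_coordinate_share(gps_like))       # scattered points: low share
print(top_coordinate_share(centroid_like))  # repeated centroid: high share
```

What counts as a suspicious share depends on your data, so treat any cutoff as something to tune rather than a fixed rule.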
Mobile Carrier Challenges
Mobile carrier IPs represent a special case where traditional geolocation concepts can
break down.
Many carriers use carrier-grade NAT to share IPs among hundreds or thousands of devices,
sometimes across entire regions or countries.
In such cases, these IPs can't be reliably geolocated to a specific city or even region.
For example, a device in Seattle and another in Miami might simultaneously share the same IP address through their mobile carrier's network.
It's important to note that not all mobile carriers behave this way. Some assign IPs in a more geographically aware manner.
Accuracy evaluation involving mobile carrier IPs should take these differences into account,
as IP assignment practices can significantly affect geolocation reliability.
Qualifying your ground truth
To ensure your ground truth is reliable:
1. Verify the data collection methodology.
2. Understand how location was determined.
3. Check for timestamps to ensure recency.
4. Filter out known VPNs, proxies, and mobile carrier IPs.
5. Cross-validate with other confirmation methods when possible.
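As a rough illustration of the checklist above, the filter below keeps only records that pass the source, recency, VPN or proxy, and mobile carrier checks. Every field name here (location_source, observed_at, and the two flags), the "gps" source label, and the 30-day window are assumptions made for this sketch. Your own dataset will have its own schema and thresholds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GroundTruthRecord:
    ip: str
    lat: float
    lon: float
    location_source: str      # hypothetical label, e.g. "gps" or "ip_fallback"
    observed_at: datetime     # when the location was recorded
    is_vpn_or_proxy: bool     # assumed to come from your own VPN/proxy detection
    is_mobile_carrier: bool   # assumed to come from your own ASN classification


def qualify(records, now, max_age_days=30, trusted_sources=("gps",)):
    """Keep records that are from a trusted source, recent, and not VPN/proxy/mobile."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        r for r in records
        if r.location_source in trusted_sources
        and r.observed_at >= cutoff
        and not r.is_vpn_or_proxy
        and not r.is_mobile_carrier
    ]


# Made-up records using documentation-only IP ranges.
now = datetime(2025, 8, 4)
records = [
    GroundTruthRecord("192.0.2.10", 52.37, 4.90, "gps", datetime(2025, 7, 20), False, False),
    GroundTruthRecord("192.0.2.11", 52.37, 4.90, "ip_fallback", datetime(2025, 7, 20), False, False),
    GroundTruthRecord("198.51.100.5", 1.35, 103.82, "gps", datetime(2025, 1, 1), False, False),
    GroundTruthRecord("198.51.100.6", 1.35, 103.82, "gps", datetime(2025, 7, 30), True, False),
    GroundTruthRecord("203.0.113.7", 47.61, -122.33, "gps", datetime(2025, 7, 30), False, True),
]
kept = qualify(records, now)
print([r.ip for r in kept])
```

Because IPs change cities more often than they change countries, a tighter max_age_days may make sense for city-level evaluation than for country-level evaluation.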
Defining accuracy metrics
Once you have reliable ground truth, you need consistent metrics to evaluate accuracy.
Distance-based metrics
Median distance error. The median distance between predicted locations and actual locations across all IPs in your dataset. This metric is less affected by outliers than the mean.
Mean distance error. The average distance between predicted locations and actual locations. This will be higher than the median if there are significant outliers.
90th percentile error. The error that 90% of the IPs in your dataset fall at or below. This helps understand the worst case scenarios while excluding extreme outliers.
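The distance-based metrics can be computed in a few lines. This is a minimal sketch, assuming great-circle (haversine) distance on a spherical Earth and a nearest-rank definition of the 90th percentile. Both are choices made for this example, and the sample errors are made up.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct percent of values are <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def distance_metrics(errors_km):
    """Median, mean, and 90th percentile of a list of distance errors in km."""
    return {
        "median_km": statistics.median(errors_km),
        "mean_km": statistics.fmean(errors_km),
        "p90_km": percentile_nearest_rank(errors_km, 90),
    }


# Made-up errors: nine small misses and one 1,000 km outlier.
errors_km = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
metrics = distance_metrics(errors_km)
print(metrics)

# Each error would normally come from a predicted vs. actual coordinate pair:
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
print(round(one_degree_km, 1))
```

With this sample, the single outlier pulls the mean far above the median, which is the effect described above.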
Percentage-based metrics
Accuracy within X km. The percentage of IPs geolocated within a specific distance (e.g. 10 km, 50 km, 100 km) of their true location.
Correct city, region, or country rate. The percentage of IPs assigned to the correct administrative division.
Coverage metrics
Percentage of IPs with results. Some providers may return "unknown" for challenging IPs. This metric helps understand overall coverage.
Confidence radius accuracy. For providers that offer a confidence radius, what percentage of true locations fall within the stated radius?
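The percentage-based and coverage metrics are simple ratios. The sketch below assumes you already have per-IP distance errors in km, predicted and actual labels, and an optional provider-stated radius. The function names and sample values are hypothetical, and "no result" is represented as None purely for this example.

```python
def accuracy_within(errors_km, threshold_km):
    """Percentage of IPs whose distance error is at most threshold_km."""
    return 100.0 * sum(e <= threshold_km for e in errors_km) / len(errors_km)


def correct_rate(predicted, actual):
    """Percentage of IPs whose predicted label (city, region, or country) matches."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * matches / len(actual)


def coverage(results):
    """Percentage of IPs for which the provider returned any result (not None)."""
    return 100.0 * sum(r is not None for r in results) / len(results)


def radius_hit_rate(errors_km, radii_km):
    """Percentage of IPs whose true location falls inside the stated radius."""
    hits = sum(e <= r for e, r in zip(errors_km, radii_km))
    return 100.0 * hits / len(errors_km)


# Made-up evaluation data.
errors_km = [3, 12, 40, 75, 400]
for km in (10, 50, 100):
    print(f"within {km} km: {accuracy_within(errors_km, km):.1f}%")

print(correct_rate(["NL", "US", "SG", "DE"], ["NL", "SG", "SG", "DE"]))
print(coverage(["Amsterdam", None, "Singapore", "Berlin"]))
print(radius_hit_rate([3, 12, 40], [5, 10, 50]))
```

Reporting coverage next to accuracy matters because a provider that returns "unknown" for hard IPs can look more accurate on the IPs it does answer.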
If ground truth isn't available, physics doesn't lie
When you don't have access to verified ground truth, network physics provides an independent verification method.
Speed of light: the ultimate authority
Network communications are bound by physics, specifically the speed of light in fiber optic cables, approximately 200,000 km per second.
This means a server truly located in Singapore cannot respond to a ping from Amsterdam in less
than 1 millisecond while taking 150 milliseconds to respond from Singapore.
Round trip time (RTT) measurements from known locations provide empirical evidence that
can't be falsified, unlike documentation that can say anything.
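The arithmetic behind this bound is short enough to write out. The sketch below uses only the 200,000 km per second figure from the text. The function names are made up for this example, and the Amsterdam to Singapore distance of roughly 10,500 km is an approximate figure used for illustration.

```python
FIBER_KM_PER_MS = 200.0  # ~200,000 km/s expressed as km per millisecond


def max_distance_km(rtt_ms):
    """Upper bound on the one-way distance implied by a round trip time."""
    # The signal travels out and back, so only half the RTT covers the distance.
    return rtt_ms * FIBER_KM_PER_MS / 2


def min_rtt_ms(distance_km):
    """Lowest physically possible RTT over a given one-way distance."""
    return 2 * distance_km / FIBER_KM_PER_MS


def claim_is_plausible(rtt_ms, claimed_distance_km):
    """True if the claimed distance from the probe is reachable within the RTT."""
    return claimed_distance_km <= max_distance_km(rtt_ms)


print(max_distance_km(1))    # a 1 ms RTT bounds the server to within 100 km
print(min_rtt_ms(10_500))    # Amsterdam to Singapore cannot beat ~105 ms

# A server answering Amsterdam in 1 ms cannot be ~10,500 km away in Singapore.
print(claim_is_plausible(1, 10_500))
```

Real paths are longer than the great-circle distance and add routing and queuing delay, so observed RTTs sit above this floor. The bound can rule a location out, but it cannot confirm one on its own.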
Evidence through triangulation
Multiple measurement points can be used to triangulate an IP's
location. Similar to GPS, this approach creates a confidence radius around the predicted location.
By comparing RTT from numerous global vantage points, you can determine not just the country but often the city where an IP is located with high confidence.
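One simple way to picture triangulation is as an intersection of circles: each vantage point's RTT gives a maximum distance, and only candidate locations inside every circle remain possible. The sketch below is a simplified illustration of that idea, not a description of how ProbeNet or any provider actually works. The RTT values are made up and the city coordinates are approximate.

```python
import math

FIBER_KM_PER_MS = 200.0   # ~200,000 km/s in fiber, as km per millisecond
EARTH_RADIUS_KM = 6371.0  # spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) tuples in degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def feasible_candidates(measurements, candidates):
    """Return candidate names that lie within every vantage point's RTT radius.

    measurements: list of ((lat, lon), rtt_ms) pairs from known vantage points.
    candidates: dict mapping a location name to its (lat, lon).
    """
    feasible = []
    for name, coords in candidates.items():
        if all(
            haversine_km(coords, vantage) <= rtt_ms * FIBER_KM_PER_MS / 2
            for vantage, rtt_ms in measurements
        ):
            feasible.append(name)
    return feasible


AMSTERDAM = (52.37, 4.90)
FRANKFURT = (50.11, 8.68)
SINGAPORE = (1.35, 103.82)
NEW_YORK = (40.71, -74.01)

# Made-up RTTs: very fast from Amsterdam, slow from Singapore and New York.
measurements = [(AMSTERDAM, 2.0), (SINGAPORE, 160.0), (NEW_YORK, 80.0)]
candidates = {"Amsterdam": AMSTERDAM, "Frankfurt": FRANKFURT, "Singapore": SINGAPORE}

result = feasible_candidates(measurements, candidates)
print(result)
```

The lowest RTT does most of the work here: a 2 ms response limits the server to within 200 km of the Amsterdam probe, which already excludes both Frankfurt and Singapore.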
Conducting independent verification
You can perform this verification yourself.
1. Use tools like ping.sx or check-host.net to test from multiple global locations.
2. Sort by lowest RTT to determine likely actual location.
3. Consider physical constraints: a response from Amsterdam in less than 10 milliseconds
means the server is at least somewhat close to Amsterdam.
4. Look for supporting evidence in hostnames, ASN registration, and other metadata.
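Once you have collected ping results from several locations, step 2 is a simple sort. The sketch below assumes you have copied the RTT samples by hand into a dictionary. It does not call ping.sx or check-host.net, and the numbers are made up for illustration.

```python
def rank_vantages(samples_ms):
    """Rank vantage points by their lowest observed RTT, closest first.

    samples_ms: dict mapping a vantage point name to a list of RTTs in ms.
    Vantage points with no samples (e.g. all pings timed out) are skipped.
    """
    best = {name: min(rtts) for name, rtts in samples_ms.items() if rtts}
    return sorted(best.items(), key=lambda item: item[1])


# Made-up measurements for a single target IP.
samples_ms = {
    "New York": [231.4, 229.8, 235.0],
    "Amsterdam": [162.9, 161.2, 170.5],
    "Singapore": [1.9, 1.4, 2.2],
    "Sydney": [],  # no replies
}

ranking = rank_vantages(samples_ms)
for name, rtt in ranking:
    print(f"{name}: {rtt} ms")
```

Using the minimum rather than the average is a deliberate choice in this sketch: congestion can only add delay, so the fastest sample is the one closest to the physical limit.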
6-step framework for proper IP data evaluation
Combining ground truth with network physics verification gives you a robust framework.
Common evaluation pitfalls to avoid
Assuming majority consensus equals accuracy. Most providers copy from the same sources.
Relying on visual spot-checking. A systematic approach is needed.
Testing only major cities. Edge cases reveal the most about data quality.
Focusing on country-level only. City-level accuracy reveals true data quality differences.
Using stale ground truth. IP assignments change frequently.
How IPinfo's approach is different
IPinfo's methodology uses multiple measurement points to triangulate an IP's location with ProbeNet, our internet measurement platform. Similar to GPS, this approach creates a confidence radius around the predicted location. By comparing RTT from numerous global vantage points,
we can determine not just the country but often the city where an IP is located with high confidence.
When we find discrepancies between our measurements and what WHOIS records claim, we rely on the physics-based evidence.
You can read more about how we measure our accuracy against ground truth data here.
Unlike providers who primarily rely on WHOIS records and geofeeds, IPinfo's approach centers on empirical evidence.
Global ProbeNet. Our 1,000+ points of presence spread across the world provide actual measurement data.
Full internet-wide scanning. We perform billions of measurements weekly to verify IP locations.
Ground truth validation. We validate our predictions against known location datasets.
Transparency. We show our evidence and explain when we disagree with other providers.
When providers disagree about an IP's location, we encourage you to check for yourself.
IP geolocation accuracy is too important to leave to trust or consensus.
By using a combination of reliable ground truth data and physics-based verification,
you can objectively determine which provider delivers the most accurate results for your
specific needs.
When providers disagree, don't assume the outlier is wrong. Investigate using independent
verification and appropriate metrics.
As our research has demonstrated, being different often means being right when everyone else
is wrong.
For assistance with conducting a thorough evaluation of IP data providers, contact our
data team.
We're happy to help you set up a structured comparison based on evidence that meets your
unique requirements.
About the author.
Daniel Quant leads the Solutions Engineering team at IPinfo, where he helps customers get the most out of internet data.
Before IPinfo, he worked in data science in the hospitality industry.
Thank you for listening to this Hacker Noon story, read by Artificial Intelligence.
Visit hackernoon.com to read, write, learn and publish.
