geoip_fl()
Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer
geoip_fl()
is a user-defined function that retrieves geographic information of ip address.
Note
- Use the native function geo_info_from_ip_address() instead of the function described in this document. The native function provides the same functionality and is better for performance and scalability. This document is provided for reference purposes only.
- This function retrieved geographic data from GeoLite2 data created by MaxMind, available from http://www.maxmind.com. Please review GeoLite2 End User License Agreement.
Prerequisites
- The Python plugin must be enabled on the cluster. This is required for the inline Python used in the function.
- The Python plugin must be enabled on the database. This is required for the inline Python used in the function.
Syntax
T | invoke geoip_fl(
ip_col,
country_col,
state_col,
city_col,
longitude_col,
latitude_col)
Learn more about syntax conventions.
Parameters
Name | Type | Required | Description |
---|---|---|---|
ip_col | string |
✔️ | The name of the column containing the IP addresses to resolve. |
country_col | string |
✔️ | The name of the column to store the retrieved country. |
state_col | string |
✔️ | The name of the column to store the retrieved state. |
city_col | string |
✔️ | The name of the column to store the retrieved city. |
longitude_col | real |
✔️ | The name of the column to store the retrieved longitude. |
latitude_col | real |
✔️ | The name of the column to store the retrieved latitude. |
Function definition
You can define the function by either embedding its code as a query-defined function, or creating it as a stored function in your database, as follows:
Define the function using the following let statement. No permissions are required.
Important
A let statement can't run on its own. It must be followed by a tabular expression statement. To run a working example of geoip_fl()
, see Example.
let geoip_fl=(tbl:(*), ip_col:string, country_col:string, state_col:string, city_col:string, longitude_col:string, latitude_col:string)
{
let kwargs = bag_pack('ip_col', ip_col, 'country_col', country_col, 'state_col', state_col, 'city_col', city_col, 'longitude_col', longitude_col, 'latitude_col', latitude_col);
let code= ```if 1:
from sandbox_utils import Zipackage
Zipackage.install('geoip2.zip')
import geoip2.database
ip_col = kargs['ip_col']
country_col = kargs['country_col']
state_col = kargs['state_col']
city_col = kargs['city_col']
longitude_col = kargs['longitude_col']
latitude_col = kargs['latitude_col']
result=df
reader = geoip2.database.Reader(r'C:\\Temp\\GeoLite2-City.mmdb')
def geodata(ip):
try:
gd = reader.city(ip)
geo = pd.Series((gd.country.name, gd.subdivisions.most_specific.name, gd.city.name, gd.location.longitude, gd.location.latitude))
except:
geo = pd.Series((None, None, None, None, None))
return geo
result[[country_col, state_col, city_col, longitude_col, latitude_col]] = result[ip_col].apply(geodata)
```;
tbl
| evaluate python(typeof(*), code, kwargs,
external_artifacts =
pack('geoip2.zip', 'https://artifactswestus.blob.core.windows.net/public/geoip2-4.6.0.zip',
'GeoLite2-City.mmdb', 'https://artifactswestus.blob.core.windows.net/public/GeoLite2-City-20230221.mmdb')
)
};
// Write your query to use the function here.
Example
The following example uses the invoke operator to run the function.
To use a query-defined function, invoke it after the embedded function definition.
let geoip_fl=(tbl:(*), ip_col:string, country_col:string, state_col:string, city_col:string, longitude_col:string, latitude_col:string)
{
let kwargs = bag_pack('ip_col', ip_col, 'country_col', country_col, 'state_col', state_col, 'city_col', city_col, 'longitude_col', longitude_col, 'latitude_col', latitude_col);
let code= ```if 1:
from sandbox_utils import Zipackage
Zipackage.install('geoip2.zip')
import geoip2.database
ip_col = kargs['ip_col']
country_col = kargs['country_col']
state_col = kargs['state_col']
city_col = kargs['city_col']
longitude_col = kargs['longitude_col']
latitude_col = kargs['latitude_col']
result=df
reader = geoip2.database.Reader(r'C:\\Temp\\GeoLite2-City.mmdb')
def geodata(ip):
try:
gd = reader.city(ip)
geo = pd.Series((gd.country.name, gd.subdivisions.most_specific.name, gd.city.name, gd.location.longitude, gd.location.latitude))
except:
geo = pd.Series((None, None, None, None, None))
return geo
result[[country_col, state_col, city_col, longitude_col, latitude_col]] = result[ip_col].apply(geodata)
```;
tbl
| evaluate python(typeof(*), code, kwargs,
external_artifacts =
pack('geoip2.zip', 'https://artifactswestus.blob.core.windows.net/public/geoip2-4.6.0.zip',
'GeoLite2-City.mmdb', 'https://artifactswestus.blob.core.windows.net/public/GeoLite2-City-20230221.mmdb')
)
};
datatable(ip:string) [
'8.8.8.8',
'20.53.203.50',
'20.81.111.85',
'20.103.85.33',
'20.84.181.62',
'205.251.242.103',
]
| extend country='', state='', city='', longitude=real(null), latitude=real(null)
| invoke geoip_fl('ip','country', 'state', 'city', 'longitude', 'latitude')
Output
ip | country | state | city | longitude | latitude |
---|---|---|---|---|---|
20.103.85.33 | Netherlands | North Holland | Amsterdam | 4.8883 | 52.3716 |
20.53.203.50 | Australia | New South Wales | Sydney | 151.2006 | -33.8715 |
20.81.111.85 | United States | Virginia | Tappahannock | -76.8545 | 37.9273 |
20.84.181.62 | United States | Iowa | Des Moines | -93.6124 | 41.6021 |
205.251.242.103 | United States | Virginia | Ashburn | -77.4903 | 39.0469 |
8.8.8.8 | United States | California | Los Angeles | -118.2441 | 34.0544 |