ods-explore

Query API Reference

Main interface

All of ods_explore’s functionality can be accessed with an instance of opendatasoft.Opendatasoft.

class opendatasoft.Opendatasoft

ods_explore.opendatasoft.Opendatasoft(subdomain='data', base_url=None, session=None, api_key=None, lang='en', timezone='UTC')

base_url

The resolved base API URL.

catalog

An instance of query.CatalogQuery, the top-level querying interface, as described in Making queries below.

session

The session object.

Making queries

All queries are created using the catalog attribute of an instance of opendatasoft.Opendatasoft. Four key endpoints in the Catalog API and Dataset API are supported:

# query datasets in a catalog
ods.catalog.datasets

# read one dataset
ods.catalog.dataset(dataset_id='doc-geonames-cities-5000')

# query records in a dataset, given its `dataset_id`
ods.catalog.dataset(dataset_id='doc-geonames-cities-5000').records

# read one record, given its `dataset_id` and `record_id`
(
  ods
  .catalog
  .dataset(dataset_id='doc-geonames-cities-5000')
  .record(record_id='24eec8bff4f5b55afdeeeacb326167ed6b1e933a')
)

The datasets/records attributes, and dataset()/record() methods, all return new instances of query.DatasetQuery or query.RecordQuery. With these, you can refine your search using any number of chainable methods, or retrieve results by calling a query evaluation method.

Query API

Methods that return new Queries

Since the methods below return new Queries, they’re chainable:

import ods_explore.language as lang

(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(country_code='CA', timezone='America/Vancouver')
  .exclude(population__gt=lang.avg('population'))
  .order_by('name')
)

This query adds a filter, exclusion, and ordering to the cities in the doc-geonames-cities-5000 dataset. The final result contains all Canadian cities in the America/Vancouver timezone, except for those whose population is greater than the average population, in alphabetical order by name.

filter

filter(*args, **kwargs)

Returns a new Query containing objects that match the given lookup parameters. The lookup parameters (**kwargs) should be in the format described in Field lookups below. Multiple parameters are joined via AND in the underlying ODSQL expression.

If you need to execute more complex queries (such as parameters joined with OR), you can use Q() objects or raw ODSQL (*args).

exclude

exclude(*args, **kwargs)

Like filter(), but returns a new Query containing objects that do not match the given lookup parameters.

select

select(*args, **kwargs)

Returns a new Query containing objects whose fields are limited to the given expressions. Expressions (*args) can be field names, aggregation functions, scalar functions, or F() objects, and can be combined with arithmetic operators.

To specify custom labels, use named expressions (**kwargs).

import ods_explore.language as lang
from ods_explore.query import F

# the name and population
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .select('name', 'population')
)

# double the population, labelled 'double_population'
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .select(double_population=F('population') * 2)
)

order_by

order_by(*args)

Returns a new Query with a given ordering. To indicate descending order, prepend field names with -. To order randomly, use ?.

# cities in alphabetical order by name
ods.catalog.dataset('doc-geonames-cities-5000').records.order_by('name')

# cities in descending order by population
ods.catalog.dataset('doc-geonames-cities-5000').records.order_by('-population')

# cities in a random order
ods.catalog.dataset('doc-geonames-cities-5000').records.order_by('?')
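
As a rough illustration of the naming convention above, the following sketch translates order_by()-style arguments into an ODSQL-like ordering clause. The helper to_order_by_clause is hypothetical and is not part of ods_explore, and mapping ? to random() is only an assumption about how random ordering could be expressed.

```python
def to_order_by_clause(*args):
    """Translate order_by()-style arguments into an ODSQL-like clause (illustrative only)."""
    terms = []
    for arg in args:
        if arg == '?':
            # assumption: random ordering is expressed as random()
            terms.append('random()')
        elif arg.startswith('-'):
            # a leading '-' marks descending order
            terms.append(f'{arg[1:]} desc')
        else:
            terms.append(f'{arg} asc')
    return ', '.join(terms)


clause = to_order_by_clause('-population', 'name')
print(clause)
```

The sketch accepts several fields at once, so ties on the first field fall through to the next one.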

refine

refine(**kwargs)

Returns a new Query containing objects that match the given facet values (**kwargs).

A catalog’s available facets, and the possible values for each facet, can be enumerated by calling the List facet values endpoint of the Explore V2 API directly. ods_explore does not currently provide an interface for this endpoint.

ignore

ignore(**kwargs)

Like refine(), but returns a new Query containing objects that do not match the given facet values (**kwargs).

Here, **kwargs is compatible with the in field lookup, so you may ignore multiple facet values at once.
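
To make the facet keyword arguments more concrete, here is a minimal sketch of how they might be turned into repeated querystring parameters. The helper facet_params is hypothetical, and the parameter names refine and exclude, as well as the facet:value format, are assumptions about the underlying API call rather than documented ods_explore behaviour.

```python
from urllib.parse import urlencode


def facet_params(param_name, **kwargs):
    """Build repeated '<param>=<facet>:<value>' querystring pairs (illustrative only)."""
    pairs = []
    for key, value in kwargs.items():
        # accept both `facet=value` and `facet__in=[...]` style keys
        facet = key[:-len('__in')] if key.endswith('__in') else key
        values = value if isinstance(value, (list, tuple)) else [value]
        pairs.extend((param_name, f'{facet}:{v}') for v in values)
    return urlencode(pairs)


print(facet_params('refine', country_code='CA'))
print(facet_params('exclude', country_code__in=['CA', 'FR']))
```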

 

Methods that evaluate Queries and return something other than a Query

get

get(as_json=False, **kwargs)

Returns results matched by the query as objects, or as dictionaries if as_json is True. For Queries that read one dataset or one record, a single object is returned; otherwise, a list of objects is returned.

Custom querystring parameters (such as limit or offset) can be added to the underlying API call with **kwargs.

count

count()

Returns the number of results matched by the query.

exists

exists()

Returns True if the query contains any results, and False if not.

iterator

iterator(batch_size=100, as_json=False)

Returns an iterator over results matched by the query as objects, or as dictionaries if as_json is True.

The number of results to retrieve per API call is adjustable with batch_size.
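
The batching behaviour described above can be pictured with a small sketch that pages through results using limit and offset. This is not the library's implementation: iterate_in_batches and fake_fetch_page are hypothetical names, and the fake records stand in for a real API call.

```python
from typing import Callable, Iterator, List


def iterate_in_batches(fetch_page: Callable[[int, int], List[dict]],
                       batch_size: int = 100) -> Iterator[dict]:
    """Yield results one at a time, requesting `batch_size` results per call."""
    offset = 0
    while True:
        page = fetch_page(batch_size, offset)
        yield from page
        if len(page) < batch_size:
            # a short (or empty) page means there is nothing left to fetch
            break
        offset += batch_size


# hypothetical stand-in for the API: 250 fake records served by slicing a list
FAKE_RECORDS = [{'id': i} for i in range(250)]
calls = []


def fake_fetch_page(limit: int, offset: int) -> List[dict]:
    calls.append((limit, offset))
    return FAKE_RECORDS[offset:offset + limit]


results = list(iterate_in_batches(fake_fetch_page, batch_size=100))
print(len(results), calls)
```

Because the function is a generator, a caller that stops early never triggers the remaining API calls; a larger batch_size trades fewer calls for larger responses.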

all

all(batch_size=100)

Returns all results matched by the query as a list of objects.

The number of results to retrieve per API call is adjustable with batch_size.

dataframe

dataframe(batch_size=100, **kwargs)

Returns results as a Pandas DataFrame, passing **kwargs to the underlying pandas.json_normalize() call.

The number of results to retrieve per API call is adjustable with batch_size.

first

first()

Returns the first object matched by the query.

last

last()

Returns the last object matched by the query.

aggregate

aggregate(*args, **kwargs)

Returns a dictionary of aggregate values. Expressions (*args) are aggregation functions that specify a value to be included in the output. To specify custom labels, use named expressions (**kwargs).

 

Helpers

The following are attributes and methods of Query instances.

url

url(**kwargs)

Returns the URL of the underlying API call that the query would make, useful for debugging ods-explore library code.

Custom querystring parameters (such as limit or offset) can be added to the underlying API call with **kwargs.

decoded_url

decoded_url

Like url(), but returns the decoded URL, with plus signs replaced with spaces and %xx escapes replaced with their single-character equivalents.
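
The difference between an encoded and a decoded URL can be reproduced with the standard library alone. In the sketch below, the base URL and the querystring parameters are assumptions chosen for illustration; only urlencode and unquote_plus are real, documented functions.

```python
from urllib.parse import urlencode, unquote_plus

# hypothetical base URL and parameters, chosen only for illustration
base = 'https://data.opendatasoft.com/api/v2/catalog/datasets/doc-geonames-cities-5000/records'
params = {'where': 'country_code = "CA"', 'limit': 10}

# encoded form: spaces become '+', reserved characters become %xx escapes
encoded_url = f'{base}?{urlencode(params)}'

# decoded form: '+' becomes a space, %xx escapes become single characters
decoded_url = unquote_plus(encoded_url)

print(encoded_url)
print(decoded_url)
```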

 

Field lookups

Field lookups are how you specify the core of an ODSQL where clause. Using the key format <field name>__<field lookup>, they’re passed as keyword arguments to the Query methods filter() and exclude(), and to Q() objects.
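
As a sketch of the key format, the following code splits <field name>__<field lookup> keys and joins the resulting conditions with AND. The helpers split_lookup and to_where, the OPERATORS table, and the rendering of values are all assumptions made for illustration; they cover only a few lookups and do not reflect how ods_explore builds its where clause.

```python
import json

# assumption: a small, illustrative subset of lookups mapped to comparison operators
OPERATORS = {'exact': '=', 'gt': '>', 'gte': '>=', 'lt': '<', 'lte': '<='}


def split_lookup(key):
    """Split '<field name>__<field lookup>' into its parts, defaulting to 'exact'."""
    field, sep, lookup = key.rpartition('__')
    if not sep:
        # no double underscore: the whole key is the field name
        return key, 'exact'
    return field, lookup


def to_where(**kwargs):
    """Join one condition per keyword argument with AND (illustrative only)."""
    conditions = []
    for key, value in kwargs.items():
        field, lookup = split_lookup(key)
        conditions.append(f'{field} {OPERATORS[lookup]} {json.dumps(value)}')
    return ' AND '.join(conditions)


print(to_where(country_code='CA', population__gt=500))
```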

contains

Case-insensitive word containment.

# matches 'La Lima', 'Palos de la Frontera', and 'Shangri-La', but not 'Las Vegas'
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(name__contains='la')
)

# matches 'Santiago de la Peña', 'La Puebla de Almoradiel', and 'Saint-Jean-de-la-Ruelle'
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(name__contains='de la')
)

exact

Exact match (the default lookup behaviour when no field lookup is used).

ods.catalog.datasets.filter(dataset_id='doc-geonames-cities-5000')

# is equivalent to

ods.catalog.datasets.filter(dataset_id__exact='doc-geonames-cities-5000')

gt

Greater than.

(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(population__gt=500)
)

gte

Greater than or equal to.

lt

Less than.

lte

Less than or equal to.

in

In a given iterable, usually a list or tuple.

(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(country_code__in=['CA', 'FR'])
)

inarea

In a geographical area (for geo_point fields only).

The literals, helpers, filter functions, and enums described below are provided in the ods_explore.language module.

One of the following filter functions should be used in conjunction with this field lookup. In each case, the first argument is a Geometry literal that describes a geographical area. This is created with the geom(geometry) helper, where geometry is a WKT/WKB or GeoJSON geometry expression as a string or dictionary.

polygon(area)
Limit results to a geographical area.

geometry(area, mode=Set.Within)
Limit results to a geographical area, based on a given set mode.

circle(center, radius, unit=Unit.METERS)
Limit results to a geographical area defined by a circle.
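
To give an intuition for what a circle filter selects, the sketch below checks locally whether a point lies within a radius of a center, using the haversine great-circle distance. This is only an illustration under the assumption of a spherical Earth; the names haversine_m and in_circle are hypothetical, the coordinates are approximate, and the actual filtering is performed by the API, possibly with a different distance model.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_circle(point, center, radius_m):
    """True if `point` lies within `radius_m` meters of `center`."""
    return haversine_m(*point, *center) <= radius_m


# approximate coordinates, for illustration only
vancouver = (49.2827, -123.1207)
victoria = (48.4284, -123.3656)

print(in_circle(victoria, vancouver, 150_000))
print(in_circle(victoria, vancouver, 50_000))
```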

isnull

Is null (accepts True or False).

 

Aggregation functions

ods-explore provides the following aggregation functions in the ods_explore.language module, which can be provided as arguments to the aggregate() query evaluation method.

avg

avg(field)

Returns the average value of a numeric field.

count

count(field=None)

Returns the number of non-null values of a field, or the total number of results matched by the query if no field is provided.

envelope

envelope(field)

Returns the convex hull (envelope) of a geo_point field.

max

max(field)

Returns the maximum value of a numeric or date field.

medium

medium(field)

Returns the median value (50th percentile) of a numeric field.

min

min(field)

Returns the minimum value of a numeric or date field.

percentile

percentile(field, percentile)

Returns the nth percentile of a numeric field.

sum

sum(field)

Returns the sum of all values of a numeric field.
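
The semantics of these functions can be mirrored locally on a handful of made-up records, which may help when checking results. The sketch below uses only the standard library; the records are invented for illustration, and the values computed by the API (percentiles in particular) may be approximations that differ from exact local calculations.

```python
import statistics

# made-up records used purely for illustration; None marks a null value
records = [
    {'name': 'A', 'population': 5000},
    {'name': 'B', 'population': 7000},
    {'name': 'C', 'population': None},
    {'name': 'D', 'population': 9000},
]

# aggregations over a field only consider its non-null values
populations = [r['population'] for r in records if r['population'] is not None]

local_aggregates = {
    'total': len(records),                   # count with no field: all results
    'with_population': len(populations),     # count of non-null values of a field
    'avg_population': statistics.mean(populations),
    'median_population': statistics.median(populations),  # 50th percentile
    'max_population': max(populations),
    'min_population': min(populations),
    'sum_population': sum(populations),
}
print(local_aggregates)
```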

 

ods-explore provides the following tools in the ods_explore.query module.

Q() objects

A Q() object represents an ODSQL condition that can be used in filter() and exclude(). Q() objects make it possible to define and reuse conditions, and can be combined with the logical operators & (AND), | (OR), and ~ (NOT) to perform complex queries.

from ods_explore.query import Q

# cities whose population is less than 6000 or greater than 7000
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(Q(population__lt=6000) | Q(population__gt=7000))
)
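
To show how operator overloading makes conditions composable, here is a toy class that supports &, |, and ~. ToyQ is a hypothetical stand-in and not the ods_explore implementation: it wraps a raw expression string instead of accepting field lookups, and the way it parenthesizes its output is an assumption.

```python
class ToyQ:
    """A toy composable condition (not the ods_explore implementation)."""

    def __init__(self, expression):
        self.expression = expression

    def __and__(self, other):
        return ToyQ(f'({self.expression} AND {other.expression})')

    def __or__(self, other):
        return ToyQ(f'({self.expression} OR {other.expression})')

    def __invert__(self):
        return ToyQ(f'NOT ({self.expression})')


# conditions can be defined once and reused in several combinations
small = ToyQ('population < 6000')
large = ToyQ('population > 7000')
canadian = ToyQ('country_code = "CA"')

combined = (small | large) & ~canadian
print(combined.expression)
```

Each operator returns a new object and leaves its operands untouched, which is what allows the same condition to appear in more than one query.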

F() objects

An F() object represents the value of an object field, and makes it possible to refer to that value without having to retrieve it from the catalog. F() objects make it possible to define conditions based on field values, and can be combined with the arithmetic operators +, -, *, and /.

from ods_explore.query import F

# select the average elevation (digital elevation model = dem) in meters (the default unit) and in kilometers
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .select(elevation_m='dem', elevation_km=F('dem') / 1000)
)

# cities whose population is less than their average elevation
(
  ods
  .catalog
  .dataset('doc-geonames-cities-5000')
  .records
  .filter(population__lt=F('dem'))
)
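
The idea of referring to a field's value without retrieving it can be sketched with a toy class that records arithmetic as an expression string instead of computing a number. ToyF is hypothetical and is not the ods_explore implementation; the parenthesized output format is an assumption.

```python
class ToyF:
    """A toy field reference that builds an expression string (illustrative only)."""

    def __init__(self, expression):
        self.expression = expression

    def _combine(self, other, operator):
        # another ToyF contributes its expression; anything else is used as a literal
        right = other.expression if isinstance(other, ToyF) else repr(other)
        return ToyF(f'({self.expression} {operator} {right})')

    def __add__(self, other):
        return self._combine(other, '+')

    def __sub__(self, other):
        return self._combine(other, '-')

    def __mul__(self, other):
        return self._combine(other, '*')

    def __truediv__(self, other):
        return self._combine(other, '/')


elevation_km = ToyF('dem') / 1000
mixed = ToyF('population') * 2 + ToyF('dem')
print(elevation_km.expression)
print(mixed.expression)
```

Nothing is evaluated locally: the resulting string is what would be sent along with the query, so the arithmetic happens wherever the data lives.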

 

Objects

The following are object representations of Opendatasoft entities, implemented as typing.NamedTuples, that many query evaluation methods return by default.

Dataset

ods_explore.models.Dataset(attachments, data_visible, dataset_id, dataset_uid, features, fields, has_records, metas, visibility)

Record

ods_explore.models.Record(id, fields, size, timestamp)
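
For readers unfamiliar with typing.NamedTuple, the sketch below defines a look-alike of Record and shows how such objects behave. ToyRecord is hypothetical: the attribute names follow the signature above, but the attribute types and all values are assumptions made for illustration.

```python
from typing import Any, Dict, NamedTuple


class ToyRecord(NamedTuple):
    """A look-alike of Record; the attribute types are assumptions."""
    id: str
    fields: Dict[str, Any]
    size: int
    timestamp: str


# placeholder values, not real data
record = ToyRecord(
    id='example-record-id',
    fields={'name': 'Vancouver', 'country_code': 'CA'},
    size=123,
    timestamp='2020-01-01T00:00:00Z',
)

# attributes are read by name, and the dataset's columns live in `fields`
print(record.id, record.fields['name'])

# named tuples convert to plain dictionaries and unpack like ordinary tuples
as_dict = record._asdict()
record_id, fields, size, timestamp = record
print(as_dict['size'], fields['country_code'])
```

Named tuples are immutable, so assigning to an attribute such as record.size raises an AttributeError.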