erichiggins/1b8e34a1a9816245192c (last active November 25, 2020 17:31)
Efficiently page over a Query to fetch all entities from the Google App Engine Datastore.
#!/usr/bin/python
"""
Functions are provided for both the DB and NDB Datastore APIs.

References:
* https://cloud.google.com/appengine/docs/python/datastore/queries
* https://cloud.google.com/appengine/docs/python/ndb/queries
"""


def db_fetch_all(query, limit=100, cursor=None):
    """Fetch all entities for a query using the DB Datastore API."""
    results = []
    more = True
    if cursor:
        query = query.with_cursor(cursor)
    # Fetch entities in batches of `limit`.
    while more:
        entities = query.fetch(limit)
        results.extend(entities)
        # Resume the next fetch where this batch ended.
        query = query.with_cursor(query.cursor())
        # An empty batch means the query is exhausted.
        more = bool(entities)
    return results


def ndb_fetch_all(query, limit=100, cursor=None):
    """Fetch all entities for a query using the NDB Datastore API."""
    results = []
    more = True
    # Fetch entities in batches of `limit`.
    while more:
        # fetch_page() returns the next cursor and whether more results remain.
        entities, cursor, more = query.fetch_page(limit, start_cursor=cursor)
        results.extend(entities)
    return results
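
The paging loop can be sanity-checked without a Datastore connection by running it against a stand-in object. The sketch below assumes a hypothetical `FakeQuery` class that imitates only the `fetch_page(limit, start_cursor=...)` return shape of `(entities, cursor, more)`. It is not part of the NDB API, and plain integer offsets stand in for real cursors.

```python
class FakeQuery(object):
    """Hypothetical stand-in that mimics only the fetch_page() call shape."""

    def __init__(self, items):
        self._items = list(items)
        self.calls = 0  # Number of fetch_page() round trips made.

    def fetch_page(self, limit, start_cursor=None):
        self.calls += 1
        start = start_cursor or 0  # An integer offset plays the cursor role.
        end = start + limit
        page = self._items[start:end]
        more = end < len(self._items)
        return page, end, more


def ndb_fetch_all(query, limit=100, cursor=None):
    """Same paging loop as above, repeated so this sketch runs on its own."""
    results = []
    more = True
    while more:
        entities, cursor, more = query.fetch_page(limit, start_cursor=cursor)
        results.extend(entities)
    return results


fake = FakeQuery(range(250))
everything = ndb_fetch_all(fake, limit=100)
print(len(everything), fake.calls)
```

With 250 items and a limit of 100, the loop should stop after the third page, because that page reports that nothing more remains.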
Based on some more thorough testing and research:

* `.fetch_page()` is best suited for user-facing pagination with cursors.
* `.fetch()` is best suited when the number of results is known to be under 2000 or so.
* `.iter()` with a `batch_size` of 200 or so appears to be the fastest way to iterate over a query with an unknown number of results.

Sample performance data from a basic query with no filters and 2100 entities:

* `fetch_page`: 2.044650s, 3.684160s, 4.055870s, 4.400940s, 4.795300s, 4.839350s, 11.897800s, 12.700310s, 3.950250s, 3.813200s, 4.106560s, 3.774050s, 4.628290s
* `[x for x in query]`: 1.222110s, 6.526030s, 10.406020s, 9.368520s, 2.529480s, 2.404470s, 3.003570s, 3.604640s, 3.675460s, 3.448450s, 3.297120s, 3.561790s, 3.327240s, 3.144370s
* `ndb.tasklet` and `query.iter()`: 2.166430s, 2.114100s
* `list(query)`: 1.627440s, 2.659040s
* `list(query)` with `batch_size=100`: 1.344340s, 2.518590s, 3.010650s
* `fetch()`: 0.663910s, 1.000210s, 1.211190s
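
Timings like those above could be gathered with a small wrapper around each strategy. The following is only a sketch of one way to do it, not the method used for the numbers listed. `timed` and `StubQuery` are hypothetical names, and the stub merely imitates the `iter(batch_size=...)` and `fetch()` call shapes so the example runs without App Engine.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn once and return (elapsed_seconds, result)."""
    start = time.time()
    result = fn(*args, **kwargs)
    return time.time() - start, result


class StubQuery(object):
    """Hypothetical stand-in for a Datastore query; holds items in memory."""

    def __init__(self, items):
        self._items = list(items)

    def iter(self, batch_size=None):
        # A real query would pull batch_size entities per request; here the
        # option is accepted and ignored.
        return iter(self._items)

    def fetch(self, limit=None):
        # A limit of None returns everything.
        return self._items[:limit]


query = StubQuery(range(2100))
strategies = {
    'iter(batch_size=200)': lambda: list(query.iter(batch_size=200)),
    'fetch()': lambda: query.fetch(),
}
counts = {}
for name, fn in sorted(strategies.items()):
    elapsed, entities = timed(fn)
    counts[name] = len(entities)
    print('%-22s %.6fs (%d entities)' % (name, elapsed, len(entities)))
```

On App Engine the lambdas would wrap real query calls instead of the stub. Each measurement is worth repeating several times, since the figures above vary widely between runs of the same method.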
Relevant discussions: