Limiting SQL query to defined fields/columns in Graphene-SQLAlchemy
Posted: 2018-11-03 07:03:11
Question: This question has been posted as a GH issue under https://github.com/graphql-python/graphene-sqlalchemy/issues/134, but I thought I'd post it here too to tap the SO crowd.
A full working demo can be found under https://github.com/somada141/demo-graphql-sqlalchemy-falcon.
Consider the following SQLAlchemy ORM class:
    class Author(Base, OrmBaseMixin):
        __tablename__ = "authors"

        author_id = sqlalchemy.Column(
            sqlalchemy.types.Integer(),
            primary_key=True,
        )

        name_first = sqlalchemy.Column(
            sqlalchemy.types.Unicode(length=80),
            nullable=False,
        )

        name_last = sqlalchemy.Column(
            sqlalchemy.types.Unicode(length=80),
            nullable=False,
        )
Simply wrapped in an SQLAlchemyObjectType as such:

    class TypeAuthor(SQLAlchemyObjectType):
        class Meta:
            model = Author

and exposed through:
    # Both definitions below live inside the schema's query class
    # (a graphene.ObjectType).
    author = graphene.Field(
        TypeAuthor,
        author_id=graphene.Argument(type=graphene.Int, required=False),
        name_first=graphene.Argument(type=graphene.String, required=False),
        name_last=graphene.Argument(type=graphene.String, required=False),
    )

    @staticmethod
    def resolve_author(
        args,
        info,
        author_id: Union[int, None] = None,
        name_first: Union[str, None] = None,
        name_last: Union[str, None] = None,
    ):
        query = TypeAuthor.get_query(info=info)

        if author_id:
            query = query.filter(Author.author_id == author_id)

        if name_first:
            query = query.filter(Author.name_first == name_first)

        if name_last:
            query = query.filter(Author.name_last == name_last)

        author = query.first()

        return author
A GraphQL query such as:

    query GetAuthor {
      author(authorId: 1) {
        nameFirst
      }
    }

will cause the following raw SQL to be emitted (taken from the echo logs of the SQLA engine):

    SELECT authors.author_id AS authors_author_id, authors.name_first AS authors_name_first, authors.name_last AS authors_name_last
    FROM authors
    WHERE authors.author_id = ?
    LIMIT ? OFFSET ?
    2018-05-24 16:23:03,669 INFO sqlalchemy.engine.base.Engine (1, 1, 0)
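As one can see, we may only want the nameFirst field, i.e., the name_first column, yet the entire row is fetched. Of course the GraphQL response only contains the requested fields, i.e.,

    {
      "data": {
        "author": {
          "nameFirst": "Robert"
        }
      }
    }

but we have still fetched the entire row, which becomes a major issue when dealing with wide tables.
Is there a way to automatically communicate to SQLAlchemy which columns are needed, so as to prevent this form of over-fetching?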
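Answer 1: My question was answered on the GitHub issue (https://github.com/graphql-python/graphene-sqlalchemy/issues/134).
The idea is to identify the requested fields from the info argument (which is of type graphql.execution.base.ResolveInfo) that gets passed to the resolver function, using a get_field_names function such as the one below: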
    from graphql.language.ast import FragmentSpread


    def get_field_names(info):
        """
        Parses a query info into a list of composite field names.
        For example the following query:
            {
              carts {
                edges {
                  node {
                    id
                    name
                    ...cartInfo
                  }
                }
              }
            }
            fragment cartInfo on CartType { whatever }

        Will result in an array:
            [
                'carts',
                'carts.edges',
                'carts.edges.node',
                'carts.edges.node.id',
                'carts.edges.node.name',
                'carts.edges.node.whatever'
            ]
        """

        fragments = info.fragments

        def iterate_field_names(prefix, field):
            name = field.name.value

            if isinstance(field, FragmentSpread):
                # A fragment contributes its fields but not its own name.
                _results = []
                new_prefix = prefix
                sub_selection = fragments[field.name.value].selection_set.selections
            else:
                _results = [prefix + name]
                new_prefix = prefix + name + "."
                if field.selection_set:
                    sub_selection = field.selection_set.selections
                else:
                    sub_selection = []

            for sub_field in sub_selection:
                _results += iterate_field_names(new_prefix, sub_field)

            return _results

        results = iterate_field_names('', info.field_asts[0])

        return results
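The above function was taken from https://github.com/graphql-python/graphene/issues/348#issuecomment-267717809. That issue contains other versions of this function, but I felt this was the most complete.
The identified fields are then used to limit the columns retrieved by the SQLAlchemy query as such: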
    from sqlalchemy.orm import load_only

    fields = get_field_names(info=info)
    query = TypeAuthor.get_query(info=info).options(load_only(*fields))
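When applied to the example query above:

    query GetAuthor {
      author(authorId: 1) {
        nameFirst
      }
    }

the get_field_names function will return ['author', 'author.nameFirst']. However, as the 'original' SQLAlchemy ORM fields are snake-cased, the get_field_names function needs to be updated to remove the author prefix and convert the field names via the graphene.utils.str_converters.to_snake_case function.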
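The adjustment described above could be sketched as a small post-processing step, shown here as a rough illustration rather than the answer's actual code. The helper name `to_column_names` is hypothetical, and the local `to_snake_case` is only a stand-in for `graphene.utils.str_converters.to_snake_case`, so that the snippet runs without Graphene installed.

```python
import re
from typing import List


def to_snake_case(name: str) -> str:
    """Local stand-in for graphene.utils.str_converters.to_snake_case."""
    step = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", step).lower()


def to_column_names(field_names: List[str], root: str) -> List[str]:
    """Hypothetical helper: keep the leaf fields directly under `root`,
    drop the `root.` prefix and convert the names to snake-case."""
    prefix = root + "."
    # Names that have sub-selections (they are a prefix of another name)
    # are treated as relationships rather than columns.
    parents = {name.rsplit(".", 1)[0] for name in field_names if "." in name}

    columns = []
    for field_name in field_names:
        if not field_name.startswith(prefix):
            continue  # skips the root entry itself, e.g. 'author'
        if field_name in parents:
            continue  # has nested fields, so not a plain column
        remainder = field_name[len(prefix):]
        if "." in remainder:
            continue  # belongs to a relationship, not the primary table
        columns.append(to_snake_case(remainder))
    return columns


columns = to_column_names(["author", "author.nameFirst"], root="author")
print(columns)  # ['name_first']
```

The resulting list is what would be passed to `load_only` in place of the raw output of `get_field_names`.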
Long story short, the above approach yields a raw SQL query like this:

    INFO:sqlalchemy.engine.base.Engine:SELECT authors.author_id AS authors_author_id, authors.name_first AS authors_name_first
    FROM authors
    WHERE authors.author_id = ?
    LIMIT ? OFFSET ?
    2018-06-09 13:22:16,396 INFO sqlalchemy.engine.base.Engine (1, 1, 0)
Update
Should anyone land here wondering about the implementation, I got around to implementing my own version of the get_query_fields function:
    from typing import List, Dict, Union, Type

    import graphql
    from graphql.language.ast import FragmentSpread
    from graphql.language.ast import Field
    from graphene.utils.str_converters import to_snake_case
    import sqlalchemy.orm

    from demo.orm_base import OrmBaseMixin


    def extract_requested_fields(
        info: graphql.execution.base.ResolveInfo,
        fields: List[Union[Field, FragmentSpread]],
        do_convert_to_snake_case: bool = True,
    ) -> Dict:
        """Extracts the fields requested in a GraphQL query by processing the AST
        and returns a nested dictionary representing the requested fields.

        Note:
            This function should support arbitrarily nested field structures
            including fragments.

        Example:
            Consider the following query passed to a resolver and running this
            function with the `ResolveInfo` object passed to the resolver.

            >>> query = "query getAuthor{author(authorId: 1){nameFirst, nameLast}}"
            >>> extract_requested_fields(info, info.field_asts, True)
            {'author': {'name_first': None, 'name_last': None}}

        Args:
            info (graphql.execution.base.ResolveInfo): The GraphQL query info passed
                to the resolver function.
            fields (List[Union[Field, FragmentSpread]]): The list of `Field` or
                `FragmentSpread` objects parsed out of the GraphQL query and stored
                in the AST.
            do_convert_to_snake_case (bool): Whether to convert the fields as they
                appear in the GraphQL query (typically in camel-case) back to
                snake-case (which is how they typically appear in ORM classes).

        Returns:
            Dict: The nested dictionary containing all the requested fields.
        """

        result = {}
        for field in fields:

            # Set the `key` as the field name.
            key = field.name.value

            # Convert the key from camel-case to snake-case (if required).
            if do_convert_to_snake_case:
                key = to_snake_case(name=key)

            # Initialize `val` to `None`. Fields without nested-fields under them
            # will have a dictionary value of `None`.
            val = None

            # If the field is of type `Field` then extract the nested fields under
            # the `selection_set` (if defined). These nested fields will be
            # extracted recursively and placed in a dictionary under the field
            # name in the `result` dictionary.
            if isinstance(field, Field):
                if (
                    hasattr(field, "selection_set") and
                    field.selection_set is not None
                ):
                    # Extract field names out of the field selections.
                    val = extract_requested_fields(
                        info=info,
                        fields=field.selection_set.selections,
                        do_convert_to_snake_case=do_convert_to_snake_case,
                    )
                result[key] = val
            # If the field is of type `FragmentSpread` then retrieve the fragment
            # from `info.fragments` and recursively extract the nested fields but
            # as we don't want the name of the fragment appearing in the result
            # dictionary (since it does not match anything in the ORM classes) the
            # extracted fields are merged directly into `result`.
            elif isinstance(field, FragmentSpread):
                # Retrieve referenced fragment.
                fragment = info.fragments[field.name.value]
                # Extract field names out of the fragment selections.
                val = extract_requested_fields(
                    info=info,
                    fields=fragment.selection_set.selections,
                    do_convert_to_snake_case=do_convert_to_snake_case,
                )
                # Merge rather than overwrite so that fields listed before the
                # fragment spread are not lost.
                result.update(val)

        return result
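It parses the AST into a dict, preserving the structure of the query and (hopefully) matching that of the ORM.
Running it on the info object of a query such as:

    query getAuthor {
      author(authorId: 1) {
        nameFirst,
        nameLast
      }
    }

produces

    {'author': {'name_first': None, 'name_last': None}}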
while a more complex query like this:

    query getAuthor {
      author(nameFirst: "Brandon") {
        ...authorFields
        books {
          ...bookFields
        }
      }
    }

    fragment authorFields on TypeAuthor {
      nameFirst,
      nameLast
    }

    fragment bookFields on TypeBook {
      title,
      year
    }

produces:

    {'author': {'books': {'title': None, 'year': None},
                'name_first': None,
                'name_last': None}}
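Now, these dictionaries can be used to tell apart what is a field on the primary table (Author in this case), since such keys have a value of None (e.g., name_first), from what is a field on a relationship of that primary table, such as the field title on the books relationship.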
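To make that distinction concrete, here is a minimal pure-Python sketch that groups the leaf fields of such a dictionary by the relationship path they sit under. The name `group_fields_by_path` is hypothetical and is not part of the answer's code; the input is the dictionary shown above with the top-level `author` key already stripped.

```python
from typing import Dict, List, Optional, Tuple

FieldTree = Dict[str, Optional[dict]]


def group_fields_by_path(
    tree: FieldTree,
    path: Tuple[str, ...] = (),
) -> Dict[Tuple[str, ...], List[str]]:
    """Hypothetical helper: map each relationship path to its leaf fields.

    The empty path `()` stands for the primary table itself.
    """
    grouped: Dict[Tuple[str, ...], List[str]] = {path: []}
    for key, val in tree.items():
        if val is None:
            # A value of None marks a plain field at the current level.
            grouped[path].append(key)
        else:
            # A dictionary marks a relationship; recurse one level down.
            grouped.update(group_fields_by_path(val, path + (key,)))
    return grouped


requested = {
    "author": {
        "books": {"title": None, "year": None},
        "name_first": None,
        "name_last": None,
    }
}

grouped = group_fields_by_path(requested["author"])
print(grouped[()])          # ['name_first', 'name_last']
print(grouped[("books",)])  # ['title', 'year']
```

Because the helper recurses, relationships of relationships end up under longer paths such as `("books", "publisher")`, where `publisher` is a made-up relationship name used only for illustration.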
A simplistic approach to auto-applying those fields can take the form of the following function:
    def apply_requested_fields(
        info: graphql.execution.base.ResolveInfo,
        query: sqlalchemy.orm.Query,
        orm_class: Type[OrmBaseMixin]
    ) -> sqlalchemy.orm.Query:
        """Updates the SQLAlchemy Query object by limiting the loaded fields of the
        table and its relationship to the ones explicitly requested in the GraphQL
        query.

        Note:
            This function is fairly simplistic in that it assumes that (1) the
            SQLAlchemy query only selects a single ORM class/table and that (2)
            relationship fields are only one level deep, i.e., that requested fields
            are either table fields or fields of the table relationship, e.g., it
            does not support fields of relationship relationships.

        Args:
            info (graphql.execution.base.ResolveInfo): The GraphQL query info passed
                to the resolver function.
            query (sqlalchemy.orm.Query): The SQLAlchemy Query object to be updated.
            orm_class (Type[OrmBaseMixin]): The ORM class of the selected table.

        Returns:
            sqlalchemy.orm.Query: The updated SQLAlchemy Query object.
        """

        # Extract the fields requested in the GraphQL query.
        fields = extract_requested_fields(
            info=info,
            fields=info.field_asts,
            do_convert_to_snake_case=True,
        )

        # We assume that the top level of the `fields` dictionary only contains a
        # single key referring to the GraphQL resource being resolved.
        tl_key = list(fields.keys())[0]
        # We assume that any keys that have a value of `None` (as opposed to
        # dictionaries) are fields of the primary table.
        table_fields = [
            key for key, val in fields[tl_key].items()
            if val is None
        ]

        # We assume that any keys that have a value being a dictionary are
        # relationship attributes on the primary table with the keys in the
        # dictionary being fields on that relationship. Thus we create a list of
        # `[relationship_name, relationship_fields]` lists to be used in the
        # `joinedload` definitions.
        relationship_fieldsets = [
            [key, val.keys()]
            for key, val in fields[tl_key].items()
            if isinstance(val, dict)
        ]

        # Assemble a list of `joinedload` definitions on the defined relationship
        # attribute name and the requested fields on that relationship.
        options_joinedloads = []
        for relationship_fieldset in relationship_fieldsets:
            relationship = relationship_fieldset[0]
            rel_fields = relationship_fieldset[1]
            options_joinedloads.append(
                sqlalchemy.orm.joinedload(
                    getattr(orm_class, relationship)
                ).load_only(*rel_fields)
            )

        # Update the SQLAlchemy query by limiting the loaded fields on the primary
        # table as well as by including the `joinedload` definitions.
        query = query.options(
            sqlalchemy.orm.load_only(*table_fields),
            *options_joinedloads
        )

        return query
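Since the function returns an ordinary Query, the usual clauses such as `.filter_by(...)`, `.one()` or `.all()` can presumably be chained after it. The sketch below illustrates that chaining without Graphene: it assumes SQLAlchemy 1.4 or later, an in-memory SQLite database, a simplified `Author` class without `OrmBaseMixin`, and placeholder row data. The `requested` list is a stand-in for the table fields that `extract_requested_fields` would yield, and `load_only` is called directly in place of `apply_requested_fields` because no `ResolveInfo` exists outside a resolver.

```python
import sqlalchemy
from sqlalchemy.orm import Session, declarative_base, load_only

Base = declarative_base()


class Author(Base):
    """Simplified copy of the question's ORM class (no OrmBaseMixin)."""
    __tablename__ = "authors"

    author_id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    name_first = sqlalchemy.Column(sqlalchemy.Unicode(80), nullable=False)
    name_last = sqlalchemy.Column(sqlalchemy.Unicode(80), nullable=False)


engine = sqlalchemy.create_engine("sqlite://")
Base.metadata.create_all(engine)

# Placeholder data, used only to have a row to load.
with Session(engine) as session:
    session.add(Author(author_id=1, name_first="Robert", name_last="Example"))
    session.commit()

# Stand-in for the table fields extracted from the GraphQL query.
requested = ["name_first"]

with Session(engine) as session:
    query = (
        session.query(Author)
        # Column attributes are passed instead of strings, since newer
        # SQLAlchemy releases no longer accept plain string names here.
        .options(load_only(*[getattr(Author, name) for name in requested]))
        .filter_by(author_id=1)
    )
    author = query.one()
    # Columns that were left out of the SELECT are reported as unloaded.
    unloaded = set(sqlalchemy.inspect(author).unloaded)

print(unloaded)
```

In a resolver, the `load_only` line would be replaced by a call along the lines of `apply_requested_fields(info=info, query=query, orm_class=Author)`, with filters applied before or after it.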
Comments:
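This is exactly what I was looking for, thanks for posting it! My only question is: why isn't this output available in the info object, rather than requiring such an involved custom solution? It's a genuine question. It seems you would often want to make sure the right data is fetched from the database (preventing unnecessary/wasteful queries) for further processing, but I'm new to Graphene/GraphQL, so I wonder whether I'm just missing something about how Graphene/GraphQL is "meant" to work?

@Ascendant I don't think the issue lies with Graphene/GraphQL. Those solutions don't care how you fetch the data or how performant the fetching is. The issue is with the Graphene-SQLAlchemy package, whose primary concern is mapping a SQLAlchemy schema to a GraphQL schema. Generating such queries while covering all edge cases would be overly complex (if not impossible), so the actual fetching falls to the developer. My solution above is no panacea and, given the complexity of SQLAlchemy, it will break down in many cases.

Isn't this a more general problem in GraphQL whenever you query a database for a many-to-many relationship, no matter what ORM you use (if any)? If you have a many-to-many relationship between a and b, and your query is { a { b } }, it seems you'll want to join a and b (grouped by a) in the resolver for a... otherwise, the resolver for b within a will hit the database for each row in a. But the resolver for a can't know it needs to join b without the solution you posted.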

How would you use the apply_requested_fields function? Could you provide an example of applying it to a standard query, together with other clauses such as .filter_by(...), .one() or .all()?