How to select a PostgreSQL row that contains a specific word

Posted 2023-03-23

Asked: 2019-10-13 17:14:18

Question:

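I'm trying to build a keyword-based query for a PostgreSQL database. LIKE doesn't work, because it seems to match any row that contains any of the letters. For example: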

SELECT * FROM table WHERE column ilike '%jeep%';

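This returns any row whose column contains j, e, or p (and, for some reason, the same row shows up multiple times), not the word "jeep".

Below is my query structure. I'm using Knex and querying multiple tables: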

searchAllBoardPosts(db, term) {
        return db
            .select('*')
            .from({
                a: 'messageboard_posts',
                b: 'rentals',
                c: 'market_place',
                d: 'jobs'
            })
            .where('a.title', 'ilike', `%${term}%`)
            .orWhere('b.title', 'ilike', `%${term}%`)
            .orWhere('c.title', 'ilike', `%${term}%`)
            .orWhere('d.title', 'ilike', `%${term}%`);
    },

Thanks in advance!

Update: here is the SQL output:

select * 
from "messageboard_posts" as "a", 
"rentals" as "b",
"market_place" as "c", 
"jobs" as "d" 
where "a"."title" ilike '%jeep%'
or "b"."title" ilike '%jeep%' 
or "c"."title" ilike '%jeep%' 
or "d"."title" ilike '%jeep%'

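Comments:

I assume .from({ all of those }) is an implicit join rather than the union you want, i.e. the ILIKEs work fine and you have spotted a pattern in the duplicated results that isn't really there. I can't find where that is documented, though. Try logging the SQL it creates?

So you're saying SELECT * FROM table WHERE column ilike 'jeep' would match a row with column = "elephant" (because the string contains an "e")? I find that hard to believe. Could you update your question with some query output, or rephrase the statement in the paragraph after your SELECT example?

@Ry- From my understanding of Knex (I'm new to it), this structure is a way of aliasing the tables when querying several of them at once. I did log the SQL result, and it returned 800+ objects, even though there are only 25 objects in total across all tables. The results render in my client as they should, but with hundreds of copies of the same post.

@richyen Yes, I know! It's hard to believe, but that's the conclusion I'm left with. Every row it returns contains at least one letter from the keyword in the desired column. I'll try to add more information to that paragraph.

Not the SQL result, the SQL: the query it creates. If it looks like FROM messageboard_posts a, rentals b, market_place c, jobs d, then you are doing an implicit join.

github.com/knex/knex/issues/2378

Answer 1: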

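This query is a cross join (though the Knex syntax obscures that a little).

"This returns any row whose column contains j, e, or p (and, for some reason, the same row shows up multiple times)."

It isn't returning the same row multiple times. It's returning everything from each table named in a CROSS JOIN. This is how Postgres behaves when more than one table is named in the FROM clause (see: docs). This: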

db
  .select('*')
  .from({
    a: 'table_one',
    b: 'table_two'
  })

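will return the entire row from each of the named tables every time it gets an ILIKE match. So at a minimum you will always get an object made of two rows joined together (or as many rows as you have tables named in the FROM clause).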
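The inflated row counts follow from how a cross join pairs rows. Here is a minimal sketch in plain JavaScript, with no database involved. `crossJoin` and `ilikeContains` are hypothetical helpers and the sample rows are invented; the sketch pairs every row of one table with every row of the other and then applies the same OR filter as the query:

```javascript
// Hypothetical helper: pair every row of `left` with every row of `right`,
// which is what FROM table_one a, table_two b does.
function crossJoin(left, right) {
  const pairs = [];
  for (const a of left) {
    for (const b of right) {
      pairs.push({ a, b });
    }
  }
  return pairs;
}

// Case-insensitive substring test, standing in for ILIKE '%term%'.
const ilikeContains = (value, term) =>
  value.toLowerCase().includes(term.toLowerCase());

// Invented sample data.
const tableOne = [
  { id: 1, title: "red jeep for sale" },
  { id: 2, title: "bicycle" }
];
const tableTwo = [
  { id: 1, title: "flat to rent" },
  { id: 2, title: "garage" },
  { id: 3, title: "studio" }
];

const term = "jeep";
const allPairs = crossJoin(tableOne, tableTwo);
const matches = allPairs.filter(
  ({ a, b }) => ilikeContains(a.title, term) || ilikeContains(b.title, term)
);

console.log(allPairs.length); // 6 pairs (2 x 3)
console.log(matches.length);  // 3: the single matching row, once per row of tableTwo
```

Only one row contains the term, yet it comes back three times, each time glued to a different unrelated row from the second table.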

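The tricky part is that Knex has to map the column names onto a JavaScript object. That means that if the result contains two columns named id or title, the last one overwrites the first in the resulting object.

Let's illustrate (with wombats)

Here are a migration and a seed, just to make things clearer:

table_one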
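The overwriting can be imitated without a database. This is a sketch only: spreading two row objects into one stands in for a driver building a single flat object per joined row, and `prefixKeys` is a hypothetical helper that plays the role of explicit column aliases.

```javascript
// Two rows that share the column names id and title (invented values).
const rowFromA = { id: 1, title: "WILLMATCHwombatblahblahblah" };
const rowFromB = { id: 2, title: "NEVERMATCHwwwwwww" };

// One flat object per joined row: for duplicate keys, the later value wins.
const flattened = { ...rowFromA, ...rowFromB };
console.log(flattened); // { id: 2, title: 'NEVERMATCHwwwwwww' }

// Hypothetical helper: prefix every key with a table alias so nothing collides.
const prefixKeys = (prefix, row) =>
  Object.fromEntries(
    Object.entries(row).map(([key, value]) => [`${prefix}.${key}`, value])
  );

const aliased = { ...prefixKeys("a", rowFromA), ...prefixKeys("b", rowFromB) };
console.log(Object.keys(aliased)); // [ 'a.id', 'a.title', 'b.id', 'b.title' ]
```

In the flattened object the matching title from the first table has vanished, which is why a result can look as if it matched on nothing but a stray letter.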

exports.up = knex =>
  knex.schema.createTable("table_one", t => {
    t.increments("id");
    t.string("title");
  });

exports.down = knex => knex.schema.dropTable("table_one");

table_two

exports.up = knex =>
  knex.schema.createTable("table_two", t => {
    t.increments("id");
    t.string("title");
  });

exports.down = knex => knex.schema.dropTable("table_two");

Seed

exports.seed = knex =>
    knex("table_one")
      .del()
      .then(() => knex("table_two").del())
      .then(() =>
        knex("table_one").insert([
          { title: "WILLMATCHwombatblahblahblah" },
          { title: "WILLMATCHWOMBAT" }
        ])
      )
      .then(() =>
        knex("table_two").insert([
          { title: "NEVERMATCHwwwwwww" },
          { title: "wombatWILLMATCH" }
        ])
      );

Query

This gives us something to play with as far as ILIKE matching goes. Now we need to make the column names explicit:

  return db
    .select([
      "a.id as a.id",
      "a.title as a.title",
      "b.id as b.id",
      "b.title as b.title"
    ])
    .from({
      a: "table_one",
      b: "table_two"
    })
    .where("a.title", "ilike", `%${term}%`)
    .orWhere("b.title", "ilike", `%${term}%`);

This produces:

[
  {
    'a.id': 1,
    'a.title': 'WILLMATCHwombatblahblahblah',
    'b.id': 1,
    'b.title': 'NEVERMATCHwwwwwww'
  },
  {
    'a.id': 1,
    'a.title': 'WILLMATCHwombatblahblahblah',
    'b.id': 2,
    'b.title': 'wombatWILLMATCH'
  },
  {
    'a.id': 2,
    'a.title': 'WILLMATCHWOMBAT',
    'b.id': 1,
    'b.title': 'NEVERMATCHwwwwwww'
  },
  {
    'a.id': 2,
    'a.title': 'WILLMATCHWOMBAT',
    'b.id': 2,
    'b.title': 'wombatWILLMATCH'
  }
]

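As you can see, it cross joins the two tables. I suspect you were only seeing the results that appeared not to match, because the match was in another table and the title column name was duplicated.

So, what should the query be?

I think your (or Ry's) plan to use UNION is the right one, but it is probably worth using UNION ALL to avoid removing duplicates unnecessarily. Something like this: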

  return db
    .unionAll([
      db("market_place")
        .select(db.raw("*, 'marketplace' as type"))
        .where("title", "ilike", `%${term}%`),
      db("messageboard_posts")
        .select(db.raw("*, 'post' as type"))
        .where("title", "ilike", `%${term}%`),
      db("rentals")
        .select(db.raw("*, 'rental' as type"))
        .where("title", "ilike", `%${term}%`),
      db("jobs")
        .select(db.raw("*, 'job' as type"))
        .where("title", "ilike", `%${term}%`)
    ]);

A similar query against our test data produces this result set:

[
  { id: 1, title: 'WILLMATCHwombatblahblahblah', type: 'table_one' },
  { id: 2, title: 'WILLMATCHWOMBAT', type: 'table_one' },
  { id: 2, title: 'wombatWILLMATCH', type: 'table_two' }
]
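To make the UNION versus UNION ALL distinction concrete, here is a rough plain-JavaScript imitation. The sample rows are invented and the arrays stand in for per-table SELECT results. UNION ALL simply concatenates, while UNION additionally drops rows that are equal in every column:

```javascript
// Rows as each per-table SELECT might return them (invented sample data).
const fromPosts = [{ id: 1, title: "jeep" }];
const fromJobs = [
  { id: 1, title: "jeep" },
  { id: 2, title: "jeep driver" }
];

// UNION ALL: plain concatenation, nothing is compared or dropped.
const unionAll = [...fromPosts, ...fromJobs];

// UNION: concatenation followed by removal of rows equal in every column.
const seen = new Set();
const union = unionAll.filter((row) => {
  const key = JSON.stringify([row.id, row.title]);
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
});

// A per-table literal (like 'post' as type) makes rows from different
// tables distinct, so there is nothing left for UNION to remove.
const tagged = [
  ...fromPosts.map((row) => ({ ...row, type: "post" })),
  ...fromJobs.map((row) => ({ ...row, type: "job" }))
];

console.log(unionAll.length, union.length, tagged.length); // 3 2 3
```

Without the type column, a post and a job that happen to share an id and title would collapse into one row under UNION, which is rarely what a cross-table search wants.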

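Answer 2:

Using .union works and returns the correct values, but the results are labelled with the keys of the first table in the query. In the end I just made four separate queries, but I hope this helps someone else!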

searchAllBoardPosts(db, term) {
        return db
            .union([db
                    .select('id', 'market_place_cat')
                    .from('market_place')
                    .where('title', 'ilike', `%${term}%`)
            ])
            .union([db
                    .select('id', 'board_id')
                    .from('messageboard_posts')
                    .where('title', 'ilike', `%${term}%`)
            ])
            .union([db
                    .select('id', 'rental_cat')
                    .from('rentals')
                    .where('title', 'ilike', `%${term}%`)
            ])
            .union([db
                    .select('id', 'job_cat')
                    .from('jobs')
                    .where('title', 'ilike', `%${term}%`)
            ]);
    },
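The "keys of the first table" effect is standard SQL behaviour: a UNION names its output columns after the first SELECT and lines up later SELECTs by position. The following is a rough imitation in plain JavaScript; `unionByPosition` is a hypothetical helper and the values are invented.

```javascript
// Hypothetical stand-in for UNION: output columns are named after the first
// query's columns, and later queries contribute values by position only.
function unionByPosition(queries) {
  const columnNames = queries[0].columns;
  return queries.flatMap(({ rows }) =>
    rows.map((values) =>
      Object.fromEntries(values.map((value, i) => [columnNames[i], value]))
    )
  );
}

const result = unionByPosition([
  { columns: ["id", "market_place_cat"], rows: [[1, 4]] },
  { columns: ["id", "board_id"], rows: [[7, 3]] }
]);

console.log(result);
// [ { id: 1, market_place_cat: 4 }, { id: 7, market_place_cat: 3 } ]
```

The board_id value ends up under the market_place_cat key. Aliasing the second column to one shared name in every branch (for instance 'market_place_cat as category', using the same 'x as y' string form seen earlier in this thread) would at least make that key meaningful for all four tables.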

Answer 3:

This expression:

 WHERE column ilike 'jeep'

matches only rows where lower(column) = 'jeep', that is, values such as JEEP, jeep, or JeeP.

It does not match any other value.

If you use wildcards:

 WHERE column ilike '%jeep%'

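then it looks for 'jeep' anywhere in lower(column). It is not a character-by-character search. For that, you would use a regular expression with a character class: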

WHERE column ~* '[jep]'
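The three patterns can be compared side by side in plain JavaScript. This is only an approximation of the SQL operators, and the titles are invented sample data:

```javascript
const titles = ["Jeep", "red jeep for sale", "elephant"];

// ILIKE 'jeep' (no wildcards): the whole value must equal the term, ignoring case.
const exact = titles.filter((t) => t.toLowerCase() === "jeep");

// ILIKE '%jeep%': the term must appear somewhere as a contiguous substring.
const substring = titles.filter((t) => t.toLowerCase().includes("jeep"));

// ~* '[jep]': a character class, true if any one of j, e, p occurs.
const anyLetter = titles.filter((t) => /[jep]/i.test(t));

console.log(exact);     // [ 'Jeep' ]
console.log(substring); // [ 'Jeep', 'red jeep for sale' ]
console.log(anyLetter); // [ 'Jeep', 'red jeep for sale', 'elephant' ]
```

Only the character class matches "elephant", so ILIKE '%jeep%' on its own cannot explain the results described in the question.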

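If you want to find a word within a field, you would normally use a regular expression rather than like/ilike.

Comments:

That's what I thought, but it doesn't seem to be the case. I've used ilike in other queries that target a single table, and it seems to work fine. When targeting multiple tables, not so much.

@RyanCarville . . . It doesn't behave like a regular expression matcher, unless something in knex or a middle layer is changing your code.

@RyanCarville . . . As for your query, you need JOIN conditions between your tables.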
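As a sketch of whole-word matching, the snippet below uses a JavaScript regular expression with word boundaries. `escapeRegex` and `containsWord` are hypothetical helpers, not part of Knex or Postgres. On the database side the same idea would be written with the ~* operator; note that PostgreSQL's regular expression syntax spells the word-boundary escape \y rather than \b, as in column ~* '\yjeep\y'.

```javascript
// Escape regex metacharacters so a user-supplied term is matched literally.
const escapeRegex = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical helper: true when `term` occurs as a whole word in `value`,
// ignoring case.
function containsWord(value, term) {
  return new RegExp(`\\b${escapeRegex(term)}\\b`, "i").test(value);
}

console.log(containsWord("Red Jeep for sale", "jeep")); // true
console.log(containsWord("jeepers creepers", "jeep"));  // false
```

A plain '%jeep%' pattern would accept both strings, because it only asks for the substring and has no notion of where a word starts or ends.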
