Iterating over an array of hashes to remove strings contained in other strings in Ruby

Posted 2021-05-05

I have an array of hashes built like this:

grapes_matched << { part: part, grape: grape_match }

I would like to:

keep the current order of the array
remove an item from the array when its grape_match.name (an Active Record item) is contained in another, longer grape_match.name.

For example, suppose my array of hashes is:

{ part:"toto", grape: AR Grape with name: "Cabernet" }
{ part:"titi", grape: AR Grape with name: "Cabernet Sauvignon" }
{ part:"tutu", grape: AR Grape with name: "Merlot" }

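Since the second name, "Cabernet Sauvignon", includes the first, "Cabernet", I want to remove the first item from the array.

If possible, I would like to do this without building another array, keeping my array of hashes with its structure unchanged (unlike the code below).

For now I have something very ugly: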

grapes_matched.each do |grape_matched|
  temp_grape = grape_matched[:grape]
  temp_grape_name = I18n.transliterate(temp_grape.name).downcase
  # is the temp grape name included in one of the previous grapes?
  # first grape
  grapes_founds << temp_grape if grapes_founds.length == 0
  # other grapes
  grapes_founds.each do |grape_found|
    grapes_founds << temp_grape if !I18n.transliterate(grape_found.name).downcase.include? temp_grape_name
  end
end

I'm pretty sure this can be done in fewer lines of Ruby while keeping the original array of hashes.

Thanks in advance.

Answer

My goal was to implement a reasonably efficient algorithm.

Let's first simplify and rearrange the array.

grapes = [{ part:"toto", grape: "Cabernet" },
          { part:"tutu", grape: "Merlot" },
          { part:"titi", grape: "Cabernet Sauvignon" }]

We can then do the following, in a reasonably efficient way, to obtain the desired array.

grapes.each_with_index.
       sort_by { |g,_i| -g[:grape].size }.
       each_with_object([]) { |(g,i),a| a << [g,i] unless a.any? { |f,_i|
         f[:grape].include?(g[:grape]) } }.
       sort_by(&:last).
       map(&:first)
  #=> [{:part=>"tutu", :grape=>"Merlot"},
  #    {:part=>"titi", :grape=>"Cabernet Sauvignon"}]

The steps are as follows.

Attach an index to each hash so that its original position in grapes can be determined later.

e = grapes.each_with_index
  #=> #<Enumerator: [{:part=>"toto", :grape=>"Cabernet"},
  #                  {:part=>"tutu", :grape=>"Merlot"},
  #                  {:part=>"titi", :grape=>"Cabernet Sauvignon"}]:each_with_index>

Sort the hash/index pairs by decreasing size of g[:grape].

 b = e.sort_by { |g,_i| -g[:grape].size }
   #=> [[{:part=>"titi", :grape=>"Cabernet Sauvignon"}, 2],
   #    [{:part=>"toto", :grape=>"Cabernet"}, 0],
   #    [{:part=>"tutu", :grape=>"Merlot"}, 1]]

Add each hash/index pair [g,i] to an initially empty array a, unless a already holds a hash f for which f[:grape] includes g[:grape].

c = b.each_with_object([]) { |(g,i),a| a << [g,i] unless a.any? { |f,_i|
         f[:grape].include?(g[:grape]) } }
  #=> [[{:part=>"titi", :grape=>"Cabernet Sauvignon"}, 2],
  #    [{:part=>"tutu", :grape=>"Merlot"}, 1]]

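To put the hashes in c in the desired order, sort them by their index in the original array grapes (in this example that moves the "Merlot" hash back in front of "Cabernet Sauvignon").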

d = c.sort_by(&:last)
  #=> [[{:part=>"tutu", :grape=>"Merlot"}, 1],
  #    [{:part=>"titi", :grape=>"Cabernet Sauvignon"}, 2]]

Remove the indices.

d.map(&:first)
  #=> [{:part=>"tutu", :grape=>"Merlot"},
  #    {:part=>"titi", :grape=>"Cabernet Sauvignon"}]

Depending on requirements, it may be preferable to replace f[:grape].include?(g[:grape]) with f[:grape].start_with?(g[:grape]) || f[:grape].end_with?(g[:grape]).
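To make the difference between the two tests concrete, here is a minimal sketch that runs the same pipeline with the containment test passed in as a block. The helper name drop_contained and the extra "net Sau" entry are hypothetical, added only for illustration; the edge test uses Ruby's String#start_with? and String#end_with?.

```ruby
# Hypothetical helper (not part of the original answer): the same pipeline,
# with the containment test supplied by the caller as a block.
def drop_contained(grapes, &contained)
  grapes.each_with_index
        .sort_by { |g, _i| -g[:grape].size }
        .each_with_object([]) { |(g, i), a|
          # keep g unless an already-kept (longer) name contains it
          a << [g, i] unless a.any? { |f, _j| contained.call(f[:grape], g[:grape]) }
        }
        .sort_by(&:last)
        .map(&:first)
end

# "net Sau" is an invented entry that sits in the middle of "Cabernet Sauvignon".
grapes = [{ part: "toto", grape: "Cabernet" },
          { part: "tutu", grape: "Merlot" },
          { part: "tata", grape: "net Sau" },
          { part: "titi", grape: "Cabernet Sauvignon" }]

anywhere = drop_contained(grapes) { |long, short| long.include?(short) }
edges    = drop_contained(grapes) { |long, short| long.start_with?(short) || long.end_with?(short) }

anywhere.map { |h| h[:part] } #=> ["tutu", "titi"]
edges.map { |h| h[:part] }    #=> ["tutu", "tata", "titi"]
```

With include? the mid-string fragment is dropped; with the prefix/suffix test it survives, which may or may not be what the data calls for.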

Below, @Max's solution is compared with mine.

def max_way(grapes_matched)
  grapes_matched.select do |grape_matched|
    grapes_matched.none? { |gm| gm[:grape] != grape_matched[:grape] && 
      grape_matched[:grape].include?(gm[:grape]) }
  end
end

def cary_way(grapes)
  grapes.each_with_index.
         sort_by { |g,_i| -g[:grape].size }.
         each_with_object([]) { |(g,i),a| a << [g,i] unless a.any? { |f,_i|
           f[:grape].include?(g[:grape]) } }.
         sort_by(&:last).
         map(&:first)
end

ALPHA = ('a'..'z').to_a
def rnd5
  '     '.gsub(' ') { ALPHA.sample }
end

def grapes(n, m)
  n.times.each_with_object([]) do |i,a|
    s1, s2 = rnd5, rnd5
    a << { grape: "%s %s" % [s1, s2] }
    a << { grape: i.even? ? s1 : s2 } if i < m
  end.shuffle
end

require 'fruity'

def bench(n, m)
  (grapes_matched = grapes(n, m)).size
  compare do
    Max  { max_way(grapes_matched) }
    Cary { cary_way(grapes_matched) }
  end
end

bench   95, 5
Running each test once. Test will take about 1 second.
Cary is faster than Max by 3x ± 1.0

bench  950, 50
Running each test once. Test will take about 13 seconds.
Cary is faster than Max by 3x ± 1.0

bench  950, 500
Running each test once. Test will take about 23 seconds.
Cary is faster than Max by 4x ± 0.1

Another answer

It can be shorter:

grapes_founds = grapes_matched.select do |grape_matched|
  grapes_matched.none? { |gm| gm[:grape] != grape_matched[:grape] && grape_matched[:grape].include?(gm[:grape]) }
end

In plain English: select every grape for which no other grape with a different name has its name included in this grape's name.
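As written, the block tests whether this grape's name includes another name, so on the question's sample data it keeps "Cabernet" and drops "Cabernet Sauvignon". If the intent is the reverse (drop the shorter, contained name), one possible adjustment is to swap the receiver and argument of include?. The sketch below uses plain strings in place of the Active Record objects, and the variable names keeps_shorter and keeps_longer are illustrative only.

```ruby
grapes_matched = [{ part: "toto", grape: "Cabernet" },
                  { part: "titi", grape: "Cabernet Sauvignon" },
                  { part: "tutu", grape: "Merlot" }]

# As in the answer: keep a grape only if its name contains no other name.
keeps_shorter = grapes_matched.select do |grape_matched|
  grapes_matched.none? { |gm| gm[:grape] != grape_matched[:grape] && grape_matched[:grape].include?(gm[:grape]) }
end

# Variant (assumption about the intent): keep a grape only if no other name contains it.
keeps_longer = grapes_matched.select do |grape_matched|
  grapes_matched.none? { |gm| gm[:grape] != grape_matched[:grape] && gm[:grape].include?(grape_matched[:grape]) }
end

keeps_shorter.map { |h| h[:grape] } #=> ["Cabernet", "Merlot"]
keeps_longer.map { |h| h[:grape] }  #=> ["Cabernet Sauvignon", "Merlot"]
```

Both versions preserve the original order, since select walks the array front to back.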

I'm not entirely clear on what your data structure is or how your strings are normalized, so you may need to massage this into the right form.
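As a rough sketch of that massaging, the following assumes each :grape value responds to name, as the Active Record objects in the question do. The Grape struct is only a stand-in for the model, and the normalized helper is hypothetical: the question uses I18n.transliterate(name).downcase, which needs the i18n gem, so plain downcase is used here. It filters with reject!, which mutates the array in place and keeps the survivors in order, though it does build a small helper array of names first.

```ruby
# Stand-in for the Active Record model (assumption): anything with #name works.
Grape = Struct.new(:name)

# Hypothetical normalizer; swap in I18n.transliterate(...) if the i18n gem is loaded.
def normalized(grape)
  grape.name.downcase.strip
end

grapes_matched = [
  { part: "toto", grape: Grape.new("Cabernet") },
  { part: "titi", grape: Grape.new("cabernet sauvignon") },
  { part: "tutu", grape: Grape.new("Merlot") }
]

# Snapshot the normalized names first so the filter does not read the
# array while it is being modified.
names = grapes_matched.map { |h| normalized(h[:grape]) }

# Drop every hash whose normalized name is contained in a different name.
grapes_matched.reject! do |h|
  name = normalized(h[:grape])
  names.any? { |other| other != name && other.include?(name) }
end

grapes_matched.map { |h| h[:part] } #=> ["titi", "tutu"]
```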
