Passing keys and values from a Perl regular expression to a hash

Posted: 2021-12-26 02:54:45

Question:

Can you tell me how to put the contents of capture groups into a hash in Perl?

Example:

I have a file:

https://www.youtube.com/watch?v=5qap5aO4i9A
http://example.com:8080/r/p?s=10&z=11#text
https://exapmle.com/test/p?var=100
http://test.org:81/
https://main.org
gopher://gopher.floodgap.com/gopher/relevance.txt
file:///home/user/.profile
gemini://transjovian.org/

I want to break each line of this file into a set of key-value pairs, add them to a hash, and then print the contents of that hash.

My script:

#!/usr/bin/env perl

use strict;
use utf8;
use warnings;
use feature qw(say);
use Data::Dumper;

sub parse_url {
    my ($url) = @_;
    if ($url =~ m#(.*):/(.*)#) {
        my (%hash, $scheme, $domain, $port, $path, $query_string, $anchor);
        $url =~ m!^(?<scheme>[^:]+):/{2,3}(?<domain>[^:/]+)(?::(?<port>(?:\d+)?)?)(?<path>(?:/[^?]+)?)(?:\?(?<query_string>(?:[^\#]+)?)?)(?:\#(?<anchor>(?:.+)?)?)!;

        if(defined($scheme)) { $hash{'scheme'} = $scheme; }
        if(defined($domain)) { $hash{'domain'} = $domain; }
        if(defined($port)) { $hash{'port'} = $port; }
        if(defined($path)) { $hash{'path'} = $path; }
        if(defined($query_string)) { $hash{'query_string'} = $query_string; }
        if(defined($anchor)) { $hash{'anchor'} = $anchor; }
        return %hash;
    }
}

while (my $row = <>) {
    chomp $row;
    say $row;
    my %hash = parse_url($row);
    print Dumper \%hash;
}

I want to get this output:

https://www.youtube.com/watch?v=5qap5aO4i9A
$VAR1 = {
    scheme         => 'https',
    domain         => 'www.youtube.com',
    path           => '/watch',
    query_string   => 'v=5qap5aO4i9A',
};
http://example.com:8080/r/p?s=10&z=11#text
$VAR1 = {
    scheme         => 'http',
    domain         => 'example.com',
    port           => '8080',
    path           => '/r/p',
    query_string   => 's=10&z=11',
    anchor         => 'text',
};
https://exapmle.com/test/p?var=100
$VAR1 = {
    scheme         => 'https',
    domain         => 'exapmle.com',
    path           => '/test/p',
    query_string   => 'var=100',
};
http://test.org:81/
$VAR1 = {
    scheme         => 'http',
    domain         => 'test.org',
    port           => '81',
};
https://main.org
$VAR1 = {
    scheme         => 'https',
    domain         => 'main.org',
};
gopher://gopher.floodgap.com/gopher/relevance.txt
$VAR1 = {
    scheme         => 'gopher',
    domain         => 'gopher.floodgap.com',
    path           => '/gopher/relevance.txt',
};
file:///home/user/.profile
$VAR1 = {
    scheme         => 'file',
    path           => '/home/user/.profile',
};
gemini://transjovian.org/
$VAR1 = {
    scheme         => 'gemini',
    domain         => 'transjovian.org',
};

But I got this result:

https://www.youtube.com/watch?v=5qap5aO4i9A
$VAR1 = {};
http://example.com:8080/r/p?s=10&z=11#text
$VAR1 = {};
https://exapmle.com/test/p?var=100
$VAR1 = {};
http://test.org:81/
$VAR1 = {};
https://main.org
$VAR1 = {};
gopher://gopher.floodgap.com/gopher/relevance.txt
$VAR1 = {};
file:///home/user/.profile
$VAR1 = {};
gemini://transjovian.org/
$VAR1 = {};

Thanks for your help!

Comments:

For parsing URLs, using the standard URI module is almost always easier and more robust: metacpan.org/dist/URI

Answer 1:

You can use the special variable %+ (or %{^CAPTURE}) to get the named captures, like this:

use strict;
use utf8;
use warnings;
use feature qw(say);
use open ':std', ':encoding(utf-8)';
use Data::Dumper;

sub parse_url {
    my ($url) = @_;
    if ($url =~ m#(.*):/(.*)#) {
        $url =~ m!
           ^(?<scheme>[^:]+):/{2,3}
            (?<domain>[^:/]+)
              (?::?(?<port>(?:\d+)?)?)
              (?<path>(?:/[^?]+)?)
              (?:\??(?<query_string>(?:[^\#]+)?)?)
              (?:\#?(?<anchor>(?:.+)?)?)
        !x;
        my %hash = %+;
        return %hash;
    }
}

while (my $row = <>) {
    chomp $row;
    say $row;
    my %hash = parse_url($row);
    if (%hash) {
        print Dumper \%hash;
    }
    else {
        say "  -> No match";
    }
}

Output:

$VAR1 = {
          'anchor' => 'text',
          'path' => '/r/p',
          'query_string' => 's=10&z=11',
          'port' => '8080',
          'scheme' => 'http',
          'domain' => 'example.com'
        };
https://www.youtube.com/watch?v=5qap5aO4i9A
$VAR1 = {
          'scheme' => 'https',
          'domain' => 'www.youtube.com',
          'port' => '',
          'anchor' => '',
          'query_string' => 'v=5qap5aO4i9A',
          'path' => '/watch'
        };
https://exapmle.com/test/p?var=100
$VAR1 = {
          'port' => '',
          'scheme' => 'https',
          'anchor' => '',
          'path' => '/test/p',
          'domain' => 'exapmle.com',
          'query_string' => 'var=100'
        };
http://test.org:81/
$VAR1 = {
          'scheme' => 'http',
          'domain' => 'test.org',
          'port' => '81',
          'anchor' => '',
          'query_string' => '/',
          'path' => ''
        };
https://main.org
$VAR1 = {
          'port' => '',
          'scheme' => 'https',
          'domain' => 'main.org',
          'anchor' => '',
          'path' => '',
          'query_string' => ''
        };
gopher://gopher.floodgap.com/gopher/relevance.txt
$VAR1 = {
          'domain' => 'gopher.floodgap.com',
          'scheme' => 'gopher',
          'port' => '',
          'query_string' => '',
          'path' => '/gopher/relevance.txt',
          'anchor' => ''
        };
file:///home/user/.profile
$VAR1 = {
          'port' => '',
          'scheme' => 'file',
          'domain' => 'home',
          'anchor' => '',
          'path' => '/user/.profile',
          'query_string' => ''
        };
gemini://transjovian.org/
$VAR1 = {
          'domain' => 'transjovian.org',
          'scheme' => 'gemini',
          'port' => '',
          'query_string' => '',
          'path' => '',
          'anchor' => ''
        };
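
A note on why the script in the question printed empty hashes: a named group such as (?<scheme>...) does not assign to a lexical variable like my $scheme. Its value is only reachable through %+ (for example $+{scheme}), so every defined() test in the question was false. The output above also contains keys with empty strings, because an optional group that matches zero characters still takes part in the match. The following is a minimal sketch, not part of the original answer, that reuses the same pattern and drops the empty captures so the result is closer to the output the question asks for. The filtering step is an assumption about what the asker wants.

```perl
use strict;
use warnings;

# Sketch: same pattern as in the answer; only the post-processing differs.
sub parse_url {
    my ($url) = @_;
    return unless $url =~ m!
        ^(?<scheme>[^:]+):/{2,3}
         (?<domain>[^:/]+)
         (?::?(?<port>(?:\d+)?)?)
         (?<path>(?:/[^?]+)?)
         (?:\??(?<query_string>(?:[^\#]+)?)?)
         (?:\#?(?<anchor>(?:.+)?)?)
    !x;
    # %+ lists only the named groups that took part in the match;
    # keep those whose captured value is non-empty.
    return map { ($_ => $+{$_}) } grep { length $+{$_} } keys %+;
}

my %parts = parse_url('http://example.com:8080/r/p?s=10&z=11#text');
printf "%-12s => %s\n", $_, $parts{$_} for sort keys %parts;
```

This only changes which keys are kept. The pattern's other quirks visible in the output above remain: a lone trailing / still lands in query_string, and file:///home/user/.profile is still split into domain 'home' and path '/user/.profile'.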

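The comment on the question recommends the URI module from CPAN instead of a hand-written pattern. A minimal sketch of that approach follows. It assumes URI is installed; the helper name parse_with_uri and the key names (copied from the question) are illustrative only. Methods are probed with can() because not every scheme class provides host or port.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: map the question's key names to URI accessor methods.
my %method_for = (
    scheme       => 'scheme',
    domain       => 'host',
    port         => 'port',
    path         => 'path',
    query_string => 'query',
    anchor       => 'fragment',
);

sub parse_with_uri {
    my ($url) = @_;
    my $uri = URI->new($url);
    my %parts;
    for my $key (keys %method_for) {
        my $method = $method_for{$key};
        next unless $uri->can($method);   # e.g. unknown schemes have no host()
        my $value = $uri->$method;
        $parts{$key} = $value if defined $value && length $value;
    }
    return %parts;
}

my %parts = parse_with_uri('http://example.com:8080/r/p?s=10&z=11#text');
printf "%-12s => %s\n", $_, $parts{$_} for sort keys %parts;
```

One difference from the regex version: for schemes URI knows about, port returns the scheme's default port (such as 443 for https) when the URL does not name one, so the port key appears even for URLs like https://main.org.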