Getting the current tab's URL from Google Chrome using C#

【Title】Getting the current tab's URL from Google Chrome using C# 【Posted】2013-09-24 16:17:45 【Question】:

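There used to be a way to get the active tab's URL from Google Chrome by using FindWindowEx in combination with a SendMessage call to read the text currently in the omnibox. A recent (?) update seems to have broken this method, since Chrome now appears to render everything itself. (You can check this with Spy++, AHK Window Spy or Window Detective.)

To get the current URL on Firefox and Opera, you can use DDE and WWW_GetWindowInfo. This doesn't seem to be possible on Chrome (anymore?).

This question has an answer with more information about how it used to work, namely this piece of code (which, as I explained, no longer works: hAddressBox is 0):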

var hAddressBox = FindWindowEx(
    intPtr,
    IntPtr.Zero,
    "Chrome_OmniboxView",
    IntPtr.Zero);

var sb = new StringBuilder(256);
SendMessage(hAddressBox, 0x000D, (IntPtr)256, sb);
temp = sb.ToString();

So my question is: is there a new way to get the URL of the currently focused tab? (Just the title is not enough.)

【Comments】:

Try taking a look at this SO question.

【Answer 1】:

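Edit: It seems the code in my answer no longer works with later versions of Chrome (although the idea of using AutomationElement still holds), so look through the other answers for different versions. For example, here is one for Chrome 54: https://***.com/a/40638519/377618

The following code seems to work (thanks to icemanind's comment), but it is resource intensive. Finding elmUrlBar takes about 350 ms, which is a bit slow.

Not to mention that we have to deal with multiple chrome processes running at the same time.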

// there are always multiple chrome processes, so we have to loop through all of them to find the
// process with a Window Handle and an automation element of name "Address and search bar"
Process[] procsChrome = Process.GetProcessesByName("chrome");
foreach (Process chrome in procsChrome) {
  // the chrome process must have a window
  if (chrome.MainWindowHandle == IntPtr.Zero) {
    continue;
  }

  // find the automation element
  AutomationElement elm = AutomationElement.FromHandle(chrome.MainWindowHandle);
  AutomationElement elmUrlBar = elm.FindFirst(TreeScope.Descendants,
    new PropertyCondition(AutomationElement.NameProperty, "Address and search bar"));

  // if it can be found, get the value from the URL bar
  if (elmUrlBar != null) {
    AutomationPattern[] patterns = elmUrlBar.GetSupportedPatterns();
    if (patterns.Length > 0) {
      ValuePattern val = (ValuePattern)elmUrlBar.GetCurrentPattern(patterns[0]);
      Console.WriteLine("Chrome URL found: " + val.Current.Value);
    }
  }
}

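Edit: I wasn't happy with the slow method above, so I made it faster (now 50 ms) and added some URL validation to make sure we get a real URL rather than something the user might be searching for on the web or a URL they are still busy typing. Here's the code: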

// there are always multiple chrome processes, so we have to loop through all of them to find the
// process with a Window Handle and an automation element of name "Address and search bar"
Process[] procsChrome = Process.GetProcessesByName("chrome");
foreach (Process chrome in procsChrome) {
  // the chrome process must have a window
  if (chrome.MainWindowHandle == IntPtr.Zero) {
    continue;
  }

  // find the automation element
  AutomationElement elm = AutomationElement.FromHandle(chrome.MainWindowHandle);

  // manually walk through the tree, searching using TreeScope.Descendants is too slow (even if it's more reliable)
  AutomationElement elmUrlBar = null;
  try {
    // walking path found using inspect.exe (Windows SDK) for Chrome 31.0.1650.63 m (currently the latest stable)
    var elm1 = elm.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, "Google Chrome"));
    if (elm1 == null) { continue; } // not the right chrome.exe
    // here, you can optionally check if Incognito is enabled:
    //bool bIncognito = TreeWalker.RawViewWalker.GetFirstChild(TreeWalker.RawViewWalker.GetFirstChild(elm1)) != null;
    var elm2 = TreeWalker.RawViewWalker.GetLastChild(elm1); // I don't know a Condition for this for finding :(
    var elm3 = elm2.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, ""));
    var elm4 = elm3.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.ToolBar));
    elmUrlBar = elm4.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.Custom));
  } catch {
    // Chrome has probably changed something, and above walking needs to be modified. :(
    // put an assertion here or something to make sure you don't miss it
    continue;
  }

  // make sure it's valid
  if (elmUrlBar == null) {
    // it's not..
    continue;
  }

  // elmUrlBar is now the URL bar element. we have to make sure that it's out of keyboard focus if we want to get a valid URL
  if ((bool)elmUrlBar.GetCurrentPropertyValue(AutomationElement.HasKeyboardFocusProperty)) {
    continue;
  }

  // there might not be a valid pattern to use, so we have to make sure we have one
  AutomationPattern[] patterns = elmUrlBar.GetSupportedPatterns();
  if (patterns.Length == 1) {
    string ret = "";
    try {
      ret = ((ValuePattern)elmUrlBar.GetCurrentPattern(patterns[0])).Current.Value;
    } catch { }
    if (ret != "") {
      // must match a domain name (and possibly "https://" in front)
      if (Regex.IsMatch(ret, @"^(https:\/\/)?[a-zA-Z0-9\-\.]+(\.[a-zA-Z]{2,4}).*$")) {
        // prepend http:// to the url, because Chrome hides it if it's not SSL
        if (!ret.StartsWith("http")) {
          ret = "http://" + ret;
        }
        Console.WriteLine("Open Chrome URL found: '" + ret + "'");
      }
    }
    continue;
  }
}

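The URL validation step above can be pulled out into a small pure helper, which makes it easier to reason about separately from the UI Automation calls. This is only a sketch: the class name OmniboxText and the method ToUrlOrNull are hypothetical, and the pattern is the answer's own heuristic, so it will still accept a search query that happens to start with something like "asp.net". Unlike the original, it checks for the "https://" prefix rather than "http", so a host such as "httpbin.org" still gets a scheme prepended.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class OmniboxText
{
    // Same heuristic as the answer: an optional "https://" followed by something
    // that looks like a domain name with a 2-4 letter final label.
    private static readonly Regex UrlLike = new Regex(
        @"^(https:\/\/)?[a-zA-Z0-9\-\.]+(\.[a-zA-Z]{2,4}).*$");

    // Returns the text as a URL with a scheme, or null when the text does not
    // look like a URL (for example, a search query that is still being typed).
    public static string ToUrlOrNull(string text)
    {
        if (string.IsNullOrEmpty(text) || !UrlLike.IsMatch(text))
        {
            return null;
        }

        // Assumption carried over from the answer: the omnibox hides "http://"
        // for non-SSL pages, so a missing scheme is treated as plain http.
        return text.StartsWith("https://", StringComparison.Ordinal)
            ? text
            : "http://" + text;
    }
}
```

With this in place, the innermost block of the loop could shrink to a single call such as OmniboxText.ToUrlOrNull(ret) followed by a null check.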
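【Comments】:

Hi, I am using your code above but can't figure out what AutomationPattern[] and TreeScope are. It won't compile; do I need to add some DLL to use them?

Add using System.Windows.Automation;

Here's a hint. In Visual Studio 2010, you can put your cursor on the unknown identifier and press Ctrl+Dot on your keyboard. It will give you a list of things you can do to make it work. In any case, you'll probably need to include UIAutomationTypes.dll in your project references.

Add the following two references to your solution: 1. UIAutomationClient, 2. UIAutomationTypes. You won't get any errors.

Hi, this used to work fine for me, but recently, with the Chrome upgrade to version 34.0.1847.116m, it has stopped working because Google has changed something. Can anyone suggest a fix and/or tools that can be used to find the handles, properties and so on, so that it can be fixed?

【Answer 2】:

As of Chrome 54, the following code works for me: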

public static string GetActiveTabUrl()
{
  Process[] procsChrome = Process.GetProcessesByName("chrome");

  if (procsChrome.Length <= 0)
    return null;

  foreach (Process proc in procsChrome)
  {
    // the chrome process must have a window
    if (proc.MainWindowHandle == IntPtr.Zero)
      continue;

    // locate the omnibox by its accessible name and read its value
    AutomationElement root = AutomationElement.FromHandle(proc.MainWindowHandle);
    var SearchBar = root.FindFirst(TreeScope.Descendants, new PropertyCondition(AutomationElement.NameProperty, "Address and search bar"));
    if (SearchBar != null)
      return (string)SearchBar.GetCurrentPropertyValue(ValuePatternIdentifiers.ValueProperty);
  }

  return null;
}

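【Comments】:

Have you tried this on Chrome 55? It doesn't seem to work there. The way this keeps getting harder seems to point toward a deliberate removal of all access to the URL.

Interesting. I wonder if it's related to the fact that I'm running Parallels on a Mac. searchBar returns null for me in all cases.

@JonLimjap: I don't have access to a Mac right now and haven't tested on that platform. There might be a difference at the Chrome level (unlikely) or in Parallels' support for the automation libraries. Have you had success using this method with any other version of Chrome?

Verified on 59.0.3071.86 and its previous versions :) But it takes almost 1013 ms.

Once we have this in place, the CPU usage of Firefox and Chrome goes up to 80-90%.

【Answer 3】:

For me, all of the above methods failed on Chrome V53 and later.

This is what works: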

Process[] procsChrome = Process.GetProcessesByName("chrome");
foreach (Process chrome in procsChrome)
{
    if (chrome.MainWindowHandle == IntPtr.Zero)
        continue;

    AutomationElement element = AutomationElement.FromHandle(chrome.MainWindowHandle);
    if (element == null)
        return null;
    Condition conditions = new AndCondition(
        new PropertyCondition(AutomationElement.ProcessIdProperty, chrome.Id),
        new PropertyCondition(AutomationElement.IsControlElementProperty, true),
        new PropertyCondition(AutomationElement.IsContentElementProperty, true),
        new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.Edit));

    AutomationElement elementx = element.FindFirst(TreeScope.Descendants, conditions);
    if (elementx == null)
        continue; // no edit control in this window, try the next process
    return ((ValuePattern)elementx.GetCurrentPattern(ValuePattern.Pattern)).Current.Value;
}

Found it here:

https://social.msdn.microsoft.com/Forums/vstudio/en-US/93001bf5-440b-4a3a-ad6c-478a4f618e32/how-can-i-get-urls-of-open-pages-from-chrome-and-firefox?forum=csharpgeneral

【Comments】:

【Answer 4】:

I got results for Chrome 38.0.2125.10 with the following code (the code inside the 'try' block has to be replaced with this):

var elm1 = elm.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, "Google Chrome"));
if (elm1 == null)  continue;   // not the right chrome.exe
var elm2 = TreeWalker.RawViewWalker.GetLastChild(elm1);
var elm3 = elm2.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.HelpTextProperty, "TopContainerView"));
var elm4 = elm3.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.ToolBar));
var elm5 = elm4.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.HelpTextProperty, "LocationBarView"));
elmUrlBar = elm5.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.Edit));

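【Comments】:

Since Chrome 45 (at least) this fails at elm3/elm4. Still trying to figure out why.

【Answer 5】:

I took Angelo's solution and cleaned it up a bit... I have a fixation with LINQ :)

This is the main method; it uses a couple of extension methods: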

public IEnumerable<string> GetTabs()
{
  // there are always multiple chrome processes, so we have to loop through all of them to find the
  // process with a Window Handle and an automation element of name "Address and search bar"
  var processes = Process.GetProcessesByName("chrome");
  var automationElements = from chrome in processes
                           where chrome.MainWindowHandle != IntPtr.Zero
                           select AutomationElement.FromHandle(chrome.MainWindowHandle);

  return from element in automationElements
         select element.GetUrlBar()
         into elmUrlBar
         where elmUrlBar != null
         where !((bool) elmUrlBar.GetCurrentPropertyValue(AutomationElement.HasKeyboardFocusProperty))
         let patterns = elmUrlBar.GetSupportedPatterns()
         where patterns.Length == 1
         select elmUrlBar.TryGetValue(patterns)
         into ret
         where ret != ""
         where Regex.IsMatch(ret, @"^(https:\/\/)?[a-zA-Z0-9\-\.]+(\.[a-zA-Z]{2,4}).*$")
         select ret.StartsWith("http") ? ret : "http://" + ret;
}

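Note that the comment is misleading, as comments tend to be: the code doesn't actually look at a single AutomationElement. I left it there because Angelo's code had it.

Here's the extension class: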

public static class AutomationElementExtensions
{
  public static AutomationElement GetUrlBar(this AutomationElement element)
  {
    try
    {
      return InternalGetUrlBar(element);
    }
    catch
    {
      // Chrome has probably changed something, and above walking needs to be modified. :(
      // put an assertion here or something to make sure you don't miss it
      return null;
    }
  }

  public static string TryGetValue(this AutomationElement urlBar, AutomationPattern[] patterns)
  {
    try
    {
      return ((ValuePattern) urlBar.GetCurrentPattern(patterns[0])).Current.Value;
    }
    catch
    {
      return "";
    }
  }

  //

  private static AutomationElement InternalGetUrlBar(AutomationElement element)
  {
    // walking path found using inspect.exe (Windows SDK) for Chrome 29.0.1547.76 m (currently the latest stable)
    var elm1 = element.FindFirst(TreeScope.Children,
      new PropertyCondition(AutomationElement.NameProperty, "Google Chrome"));
    var elm2 = TreeWalker.RawViewWalker.GetLastChild(elm1); // I don't know a Condition for this for finding :(
    var elm3 = elm2.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, ""));
    var elm4 = elm3.FindFirst(TreeScope.Children,
      new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.ToolBar));
    var result = elm4.FindFirst(TreeScope.Children,
      new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.Custom));

    return result;
  }
}

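【Comments】:

【Answer 6】:

I found this post and was able to successfully pull the URL from Chrome in C# using these methods, thank you everyone!

Unfortunately, with the recent Chrome 69 update, the AutomationElement tree traversal broke again.

I found this Microsoft article: Navigate Among UI Automation Elements with TreeWalker

and used it to create a simple function that searches for the AutomationElement with the "edit" control type we are looking for, instead of traversing a tree hierarchy that is always changing, and then extracts the URL value from that AutomationElement.

I wrote a simple class that wraps all of this up: Google-Chrome-URL-Check-C-Sharp.

The readme explains how to use it.

All in all, it might just be a little more reliable, and I hope some of you find it useful.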
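The idea described in this answer can be sketched roughly as follows. This is not the code from the linked repository, which is not reproduced here; the class name ChromeUrlSketch, the method names and the MaxDepth limit are all hypothetical. It assumes that the first Edit control reached by a depth-first walk of the control view is the omnibox, which may not hold for every Chrome version or page.

```csharp
using System;
using System.Windows.Automation;

// Hypothetical sketch; requires references to UIAutomationClient and UIAutomationTypes.
public static class ChromeUrlSketch
{
    // Arbitrary cap (an assumption) so the walk does not descend through a whole web page.
    private const int MaxDepth = 12;

    // Depth-first walk of the control view that returns the first Edit control found.
    private static AutomationElement FindFirstEdit(AutomationElement parent, int depthLeft)
    {
        if (parent == null || depthLeft < 0)
        {
            return null;
        }

        TreeWalker walker = TreeWalker.ControlViewWalker;
        for (AutomationElement child = walker.GetFirstChild(parent);
             child != null;
             child = walker.GetNextSibling(child))
        {
            if (ControlType.Edit.Equals(child.Current.ControlType))
            {
                return child;
            }

            AutomationElement hit = FindFirstEdit(child, depthLeft - 1);
            if (hit != null)
            {
                return hit;
            }
        }
        return null;
    }

    // Returns the text of the first Edit control under the window, or null on failure.
    public static string TryGetUrl(IntPtr mainWindowHandle)
    {
        if (mainWindowHandle == IntPtr.Zero)
        {
            return null;
        }

        try
        {
            AutomationElement root = AutomationElement.FromHandle(mainWindowHandle);
            AutomationElement edit = FindFirstEdit(root, MaxDepth);
            object pattern;
            if (edit != null && edit.TryGetCurrentPattern(ValuePattern.Pattern, out pattern))
            {
                return ((ValuePattern)pattern).Current.Value;
            }
        }
        catch (ElementNotAvailableException)
        {
            // The window or one of its elements went away while we were walking.
        }
        return null;
    }
}
```

A caller would pass in a handle such as chrome.MainWindowHandle from the loops shown in the earlier answers. Searching by control type trades a fixed path for a broader walk, so it is likely to survive layout changes better but may be slower.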

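【Comments】:

I just added 2 small fixes to your class on GitHub. Thank you very much. Nice job! (It even works on the old Chrome 49.)

【Answer 7】:

Referring to Angelo Geels' solution, here is a patch for version 35. The code inside the "try" block must be replaced with this: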

var elm1 = elm.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, "Google Chrome"));
if (elm1 == null)  continue;  // not the right chrome.exe
var elm2 = TreeWalker.RawViewWalker.GetLastChild(elm1); // I don't know a Condition for this for finding
var elm3 = elm2.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, ""));
var elm4 = TreeWalker.RawViewWalker.GetNextSibling(elm3); // I don't know a Condition for this for finding
var elm7 = elm4.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.ToolBar));
elmUrlBar = elm7.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.Custom));  

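I took it from here: http://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/nftechsupt.web+WinBatch/dotNet/System_CodeDom+Grab~URL~from~Chrome.txt

【Comments】:

【Answer 8】:

For me, only the active Chrome window has a MainWindowHandle. I worked around this by looking through all windows for Chrome windows and then using those handles instead. For example: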

    // requires: using System; using System.Collections.Generic; using System.Runtime.InteropServices;
    public delegate bool Win32Callback(IntPtr hwnd, IntPtr lParam);

    [DllImport("user32.dll")]
    protected static extern bool EnumWindows(Win32Callback enumProc, IntPtr lParam);

    // called once per top-level window; collects every handle into the list behind the GCHandle
    private static bool EnumWindow(IntPtr handle, IntPtr pointer)
    {
        List<IntPtr> pointers = GCHandle.FromIntPtr(pointer).Target as List<IntPtr>;
        pointers.Add(handle);
        return true;
    }

    private static List<IntPtr> GetAllWindows()
    {
        Win32Callback enumCallback = new Win32Callback(EnumWindow);
        List<IntPtr> pointers = new List<IntPtr>();
        GCHandle listHandle = GCHandle.Alloc(pointers);
        try
        {
            EnumWindows(enumCallback, GCHandle.ToIntPtr(listHandle));
        }
        finally
        {
            if (listHandle.IsAllocated) listHandle.Free();
        }
        return pointers;
    }

Then, to get all the Chrome windows:

    // requires: using System.Text;
    [DllImport("User32", CharSet = CharSet.Auto, SetLastError = true)]
    public static extern int GetWindowText(IntPtr windowHandle, StringBuilder stringBuilder, int nMaxCount);

    [DllImport("user32.dll", EntryPoint = "GetWindowTextLength", SetLastError = true)]
    internal static extern int GetWindowTextLength(IntPtr hwnd);

    private static string GetTitle(IntPtr handle)
    {
        int length = GetWindowTextLength(handle);
        StringBuilder sb = new StringBuilder(length + 1);
        GetWindowText(handle, sb, sb.Capacity);
        return sb.ToString();
    }

Finally:

GetAllWindows()
    .Select(GetTitle)
    .Where(x => x.Contains("Google Chrome"))
    .ToList()
    .ForEach(Console.WriteLine);

Hopefully this saves someone else some time figuring out how to actually get the handles of all Chrome windows.
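Matching on the window title also picks up any other window whose title contains "Google Chrome". One possible refinement, sketched here under the assumption that the browser's process name is "chrome" (as in the earlier answers), is to check which process owns each handle. The class name ChromeWindowFilter and the method BelongsToChrome are hypothetical; GetWindowThreadProcessId is the standard user32 function.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the original answer.
public static class ChromeWindowFilter
{
    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint processId);

    // True when the window handle is owned by a process named "chrome".
    public static bool BelongsToChrome(IntPtr hwnd)
    {
        uint pid;
        if (GetWindowThreadProcessId(hwnd, out pid) == 0)
        {
            return false; // invalid handle
        }

        try
        {
            using (Process owner = Process.GetProcessById((int)pid))
            {
                return string.Equals(owner.ProcessName, "chrome",
                    StringComparison.OrdinalIgnoreCase);
            }
        }
        catch (ArgumentException)
        {
            return false; // the process exited between the two calls
        }
        catch (InvalidOperationException)
        {
            return false; // the process information is no longer available
        }
    }
}
```

It could then replace the title filter, for example GetAllWindows().Where(ChromeWindowFilter.BelongsToChrome), keeping GetTitle only for display.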

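【Comments】:

This will actually get any window with "Google Chrome" in its title. (For example, open this web page in Internet Explorer.) Also, for some reason, the returned list has one extra element...

Yes, but once you run the algorithm mentioned in the other answers, you can determine which windows are Chrome based on whether you can get a URL from them. Any more inclusive search is fine; the problem for me was that the search was too exclusive (it only gave me the process of the focused Chrome window).

【Answer 9】:

For version 53.0.2785, this works: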

var elm1 = elm.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, "Google Chrome"));
if (elm1 == null) { continue; } // not the right chrome.exe
var elm2 = elm1.FindAll(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, ""))[1];
var elm3 = elm2.FindAll(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, ""))[1];
var elm4 = elm3.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, "principal"));
var elm5 = elm4.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.NameProperty, ""));
elmUrlBar = elm5.FindFirst(TreeScope.Children, new PropertyCondition(AutomationElement.ControlTypeProperty, ControlType.Edit));

【Comments】:
