Custom BCS indexing connector with changelog incremental crawl is not working properly
Posted: 2013-06-02 05:02:41
I am writing a custom indexing connector that uses the changelog-based incremental crawl approach.
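I started from the sample at http://msdn.microsoft.com/en-us/library/ff625800%28v=office.14%29.aspx and am trying to adapt it to my needs.
My model has the following method stereotypes: IdEnumerator, ChangedIdEnumerator, DeletedIdEnumerator, SpecificFinder, Finder, StreamAccessor.
If I start a full crawl, IdEnumerator, ChangedIdEnumerator, and DeletedIdEnumerator are called.
First problem: SpecificFinder is not called.
If I start an incremental crawl, ChangedIdEnumerator and DeletedIdEnumerator are called.
DeletedIdEnumerator works: items whose IDs are returned as deleted are removed from the index.
Second problem: ChangedIdEnumerator does not work. Nothing happens after I return the changed IDs.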
There are now errors in the crawl log.
Here is my model:
<Model xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Name="MyFileModel" xmlns="http://schemas.microsoft.com/windows/2007/BusinessDataCatalog">
<LobSystems>
<LobSystem Name="MyFileSystem" Type="Custom">
<Properties>
<Property Name="SystemUtilityTypeName" Type="System.String">MyFileConnector.MyFileConnector, MyFileConnector, Version=1.0.0.0, Culture=neutral, PublicKeyToken=15865f58b9878bf8</Property>
<Property Name="SystemUtilityInstallDate" Type="System.DateTime">2013-01-01 00:00:00Z</Property>
<Property Name="InputUriProcessor" Type="System.String">MyFileConnector.MyFileLobUri, MyFileConnector, Version=1.0.0.0, Culture=neutral, PublicKeyToken=15865f58b9878bf8</Property>
<Property Name="OutputUriProcessor" Type="System.String">MyFileConnector.MyFileNamingContainer, MyFileConnector, Version=1.0.0.0, Culture=neutral, PublicKeyToken=15865f58b9878bf8</Property>
</Properties>
<LobSystemInstances>
<LobSystemInstance Name="MyFileConnector_instance">
<Properties>
<Property Name="AuthenticationType" Type="System.String">Credentials</Property>
</Properties>
</LobSystemInstance>
</LobSystemInstances>
<Entities>
<Entity Name="MyFolder" Namespace="MyFileConnector" Version="1.0.0.1">
<Properties>
<Property Name="Title" Type="System.String">Name</Property>
</Properties>
<Identifiers>
<Identifier Name="ID" TypeName="System.String" />
</Identifiers>
<Methods>
<!-- IdEnumerator -->
<Method Name="ReadAllIds" DefaultDisplayName="ReadAllIds" IsStatic="false">
<Parameters>
<Parameter Name="returnIds" Direction="Return">
<TypeDescriptor Name="Nodes" TypeName="Microsoft.BusinessData.Runtime.DynamicType[]" IsCollection="true">
<TypeDescriptors>
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType" Name="Node">
<TypeDescriptors>
<TypeDescriptor Name="ID" TypeName="System.String" IdentifierName="ID" />
</TypeDescriptors>
</TypeDescriptor>
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
</Parameters>
<MethodInstances>
<MethodInstance Type="IdEnumerator" Name="ReadAllIds" DefaultDisplayName="ReadAllIds" ReturnParameterName="returnIds" Default="true">
<Properties>
<Property Name="RootFinder" Type="System.String">true</Property>
</Properties>
<AccessControlList>
<AccessControlEntry Principal="NT AUTHORITY\Authenticated Users">
<Right BdcRight="Execute" />
</AccessControlEntry>
<AccessControlEntry Principal="NT AUTHORITY\System">
<Right BdcRight="SetPermissions"/>
</AccessControlEntry>
</AccessControlList>
</MethodInstance>
</MethodInstances>
</Method>
<!-- ChangedIdEnumerator -->
<Method Name="ReadIncrementalList" IsStatic="false">
<FilterDescriptors>
<FilterDescriptor Name="LastCrawl" Type="InputOutput">
<Properties>
<Property Name="SynchronizationCookie" Type="System.String">x</Property>
</Properties>
</FilterDescriptor>
<FilterDescriptor Name="Timestamp" Type="Timestamp" />
</FilterDescriptors>
<Parameters>
<Parameter Name="lastCrawlDate" Direction="InOut">
<TypeDescriptor Name="LastCrawlDate" TypeName="System.DateTime" IsCollection="false" AssociatedFilter="LastCrawl">
<Interpretation>
<NormalizeDateTime LobDateTimeMode="Local" />
</Interpretation>
</TypeDescriptor>
</Parameter>
<Parameter Name="returnIds" Direction="Return">
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType[]" Name="Nodes" IsCollection="true" >
<TypeDescriptors>
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType" Name="Node">
<TypeDescriptors>
<TypeDescriptor TypeName="System.String" IdentifierName="ID" Name="ID" />
</TypeDescriptors>
</TypeDescriptor>
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
</Parameters>
<MethodInstances>
<MethodInstance Name="ReadIncrementalListInstance" Type="ChangedIdEnumerator" ReturnParameterName="returnIds" Default="true">
<AccessControlList>
<AccessControlEntry Principal="NT AUTHORITY\Authenticated Users">
<Right BdcRight="Execute" />
<Right BdcRight="SetPermissions" />
</AccessControlEntry>
</AccessControlList>
</MethodInstance>
</MethodInstances>
</Method>
<!-- DeletedIdEnumerator -->
<Method Name="ReadDeletedIncrementalList" IsStatic="false" DefaultDisplayName="ReadDeletedIncrementalList">
<FilterDescriptors>
<FilterDescriptor Name="LastCrawl" Type="InputOutput">
<Properties>
<Property Name="SynchronizationCookie" Type="System.String">x</Property>
</Properties>
</FilterDescriptor>
<FilterDescriptor Name="Timestamp" Type="Timestamp" />
</FilterDescriptors>
<Parameters>
<Parameter Name="LastCrawlDate" Direction="InOut">
<TypeDescriptor Name="LastCrawlDate" TypeName="System.DateTime" IsCollection="false" AssociatedFilter="LastCrawl">
<Interpretation>
<NormalizeDateTime LobDateTimeMode="Local" />
</Interpretation>
</TypeDescriptor>
</Parameter>
<Parameter Name="deletedIds" Direction="Return">
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType[]" Name="Nodes" IsCollection="true">
<TypeDescriptors>
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType" Name="Node">
<TypeDescriptors>
<TypeDescriptor Name="ID" TypeName="System.String" IdentifierName="ID" />
</TypeDescriptors>
</TypeDescriptor>
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
</Parameters>
<MethodInstances>
<MethodInstance Name="ReadDeletedIncrementalListInstance" Type="DeletedIdEnumerator" ReturnParameterName="deletedIds">
<AccessControlList>
<AccessControlEntry Principal="NT AUTHORITY\Authenticated Users">
<Right BdcRight="Execute" />
<Right BdcRight="SetPermissions" />
</AccessControlEntry>
</AccessControlList>
</MethodInstance>
</MethodInstances>
</Method>
<!-- Finder -->
<Method Name="ReadAllItems" DefaultDisplayName="ReadAllItems" IsStatic="false">
<Parameters>
<Parameter Name="returnAllItems" Direction="Return">
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType[]" Name="Nodes" IsCollection="true" >
<TypeDescriptors>
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType" Name="Node">
<TypeDescriptors>
<TypeDescriptor TypeName="System.String" IdentifierName="ID" Name="ID" />
<TypeDescriptor TypeName="System.String" Name="Name" />
<TypeDescriptor TypeName="System.String" Name="Title" />
<TypeDescriptor TypeName="System.String" Name="Path" />
</TypeDescriptors>
</TypeDescriptor>
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
</Parameters>
<MethodInstances>
<MethodInstance Type="Finder" Name="ReadAllItems" DefaultDisplayName="ReadAllItems" ReturnParameterName="returnAllItems" Default="true" ReturnTypeDescriptorName="Nodes" ReturnTypeDescriptorLevel="0">
<AccessControlList>
<AccessControlEntry Principal="NT AUTHORITY\Authenticated Users">
<Right BdcRight="Execute" />
</AccessControlEntry>
<AccessControlEntry Principal="NT AUTHORITY\System">
<Right BdcRight="SetPermissions"/>
</AccessControlEntry>
</AccessControlList>
</MethodInstance>
</MethodInstances>
</Method>
<!-- SpecificFinder -->
<Method Name="ReadItem" DefaultDisplayName="ReadItem" IsStatic="false">
<Parameters>
<Parameter Direction="In" Name="ID">
<TypeDescriptor TypeName="System.String" IdentifierName="ID" Name="ID" />
</Parameter>
<Parameter Direction="Return" Name="returnParameter">
<TypeDescriptor TypeName="Microsoft.BusinessData.Runtime.DynamicType" Name="Node">
<TypeDescriptors>
<TypeDescriptor TypeName="System.String" IdentifierName="ID" Name="ID" ReadOnly="true" />
<TypeDescriptor TypeName="System.String" Name="Title" />
<TypeDescriptor TypeName="System.String" Name="Author" />
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
</Parameters>
<MethodInstances>
<MethodInstance Type="SpecificFinder" ReturnParameterName="returnParameter" ReturnTypeDescriptorName="Node" Default="true" Name="ReadItem" DefaultDisplayName="ReadItem" ReturnTypeDescriptorLevel="0">
<AccessControlList>
<AccessControlEntry Principal="NT AUTHORITY\Authenticated Users">
<Right BdcRight="Execute" />
</AccessControlEntry>
<AccessControlEntry Principal="NT AUTHORITY\System">
<Right BdcRight="SetPermissions"/>
</AccessControlEntry>
</AccessControlList>
</MethodInstance>
</MethodInstances>
</Method>
</Methods>
</Entity>
</Entities>
</LobSystem>
</LobSystems>
</Model>
What am I doing wrong? Any input is greatly appreciated.
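Answer 1: I recently ran into a similar problem in my custom BCS connector (SpecificFinder was not called) and managed to solve it. In my scenario I had two entities, a parent and a child, and SpecificFinder was called only for the parent entity, not for the child. It turned out that the problem was related to how I built the access URI. Initially the URI looked like this:
<protocol>://<entity_name>/<entity_id>
My start URL (specified in the content source definition) was the URL of a "fake" parent entity, without any ID:
<protocol>://<parent_entity_name>
But it seems that the SharePoint crawler treats access URIs the same way as web URLs and filters them by the URL path specified in the content source definition. In other words, in my case it crawled only URIs matching the following pattern:
<protocol>://<parent_entity_name>/*
After I changed the access URI format to
<protocol>://root/<entity_name>
and set the start URL in the content source definition to
<protocol>://root
everything started working correctly.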
Answer 2: Both of your SynchronizationCookie properties have the same name ("x"). Specify a different cookie name for each of your methods.
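Applied to the model in the question, that suggestion might look like the sketch below. The cookie names ChangedItemCookie and DeletedItemCookie are illustrative assumptions, not required values; the point is only that the two values differ. Only the LastCrawl filter descriptor of each method is shown, and everything else in the model stays as posted.

```xml
<!-- Inside the ReadIncrementalList method (ChangedIdEnumerator).
     The cookie name is an illustrative assumption. -->
<FilterDescriptor Name="LastCrawl" Type="InputOutput">
  <Properties>
    <Property Name="SynchronizationCookie" Type="System.String">ChangedItemCookie</Property>
  </Properties>
</FilterDescriptor>

<!-- Inside the ReadDeletedIncrementalList method (DeletedIdEnumerator).
     A different, equally illustrative cookie name. -->
<FilterDescriptor Name="LastCrawl" Type="InputOutput">
  <Properties>
    <Property Name="SynchronizationCookie" Type="System.String">DeletedItemCookie</Property>
  </Properties>
</FilterDescriptor>
```

After a change like this, the model would need to be re-imported before the next crawl.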