sbt-assembly: deduplication found error

【Title】: sbt-assembly: deduplication found error 【Posted】: 2014-09-28 10:47:56 【Question】:

I'm not sure whether a merge strategy or excluding jars is the better option here. Any help on how to get past this error would be great!

[sameert@pzxdcc0151 approxstrmatch]$ sbt assembly
[info] Loading project definition from /apps/sameert/software/approxstrmatch/project
[info] Set current project to approxstrmatch (in build file:/apps/sameert/software/approxstrmatch/)
[info] Including from cache: scala-library.jar
[info] Checking every *.class/*.jar file's SHA-1.
[info] Merging files...
[info] Including from cache: curator-client-2.4.0.jar
[info] Including from cache: secondstring-20140729.jar
[info] Including from cache: slf4j-api-1.7.5.jar
[info] Including from cache: jsr305-1.3.9.jar
[info] Including from cache: jul-to-slf4j-1.7.5.jar
[info] Including from cache: jcl-over-slf4j-1.7.5.jar
[info] Including from cache: commons-digester-1.8.jar
[info] Including from cache: compress-lzf-1.0.0.jar
[info] Including from cache: commons-beanutils-1.7.0.jar
[info] Including from cache: zookeeper-3.4.5.jar
[info] Including from cache: slf4j-log4j12-1.7.5.jar
[info] Including from cache: commons-beanutils-core-1.8.0.jar
[info] Including from cache: commons-net-2.2.jar
[info] Including from cache: commons-el-1.0.jar
[info] Including from cache: log4j-1.2.17.jar
[info] Including from cache: scala-library.jar
[info] Including from cache: jline-0.9.94.jar
[info] Including from cache: snappy-java-1.0.5.jar
[info] Including from cache: hsqldb-1.8.0.10.jar
[info] Including from cache: chill_2.10-0.3.6.jar
[info] Including from cache: oro-2.0.8.jar
[info] Including from cache: chill-java-0.3.6.jar
[info] Including from cache: kryo-2.21.jar
[info] Including from cache: reflectasm-1.07-shaded.jar
[info] Including from cache: minlog-1.2.jar
[info] Including from cache: guava-14.0.1.jar
[info] Including from cache: jetty-plus-8.1.14.v20131031.jar
[info] Including from cache: javax.transaction-1.1.1.v201105210645.jar
[info] Including from cache: jackson-mapper-asl-1.8.8.jar
[info] Including from cache: jackson-core-asl-1.8.8.jar
[info] Including from cache: jetty-webapp-8.1.14.v20131031.jar
[info] Including from cache: curator-recipes-2.4.0.jar
[info] Including from cache: jetty-xml-8.1.14.v20131031.jar
[info] Including from cache: spark-core_2.10-1.0.0.jar
[info] Including from cache: objenesis-1.2.jar
[info] Including from cache: curator-framework-2.4.0.jar
[info] Including from cache: hadoop-client-1.0.4.jar
[info] Including from cache: jetty-util-8.1.14.v20131031.jar
[info] Including from cache: scalap-2.10.4.jar
[info] Including from cache: akka-remote_2.10-2.2.3-shaded-protobuf.jar
[info] Including from cache: jetty-servlet-8.1.14.v20131031.jar
[info] Including from cache: jetty-security-8.1.14.v20131031.jar
[info] Including from cache: jetty-server-8.1.14.v20131031.jar
[info] Including from cache: javax.servlet-3.0.0.v201112011016.jar
[info] Including from cache: jetty-continuation-8.1.14.v20131031.jar
[info] Including from cache: jetty-http-8.1.14.v20131031.jar
[info] Including from cache: jetty-io-8.1.14.v20131031.jar
[info] Including from cache: hadoop-core-1.0.4.jar
[info] Including from cache: jetty-jndi-8.1.14.v20131031.jar
[info] Including from cache: xmlenc-0.52.jar
[info] Including from cache: commons-codec-1.4.jar
[info] Including from cache: javax.mail.glassfish-1.4.1.v201005082020.jar
[info] Including from cache: javax.activation-1.1.0.v201105071233.jar
[info] Including from cache: commons-math-2.1.jar
[info] Including from cache: commons-lang3-3.3.2.jar
[info] Including from cache: commons-configuration-1.6.jar
[info] Including from cache: metrics-core-3.0.0.jar
[info] Including from cache: metrics-jvm-3.0.0.jar
[info] Including from cache: metrics-json-3.0.0.jar
[info] Including from cache: commons-collections-3.2.1.jar
[info] Including from cache: metrics-graphite-3.0.0.jar
[info] Including from cache: commons-lang-2.4.jar
[info] Including from cache: akka-actor_2.10-2.2.3-shaded-protobuf.jar
[info] Including from cache: config-1.0.2.jar
[info] Including from cache: tachyon-0.4.1-thrift.jar
[info] Including from cache: netty-3.6.6.Final.jar
[info] Including from cache: protobuf-java-2.4.1-shaded.jar
[info] Including from cache: uncommons-maths-1.2.2a.jar
[info] Including from cache: akka-slf4j_2.10-2.2.3-shaded-protobuf.jar
[info] Including from cache: json4s-jackson_2.10-3.2.6.jar
[info] Including from cache: json4s-core_2.10-3.2.6.jar
[info] Including from cache: ant-1.9.0.jar
[info] Including from cache: json4s-ast_2.10-3.2.6.jar
[info] Including from cache: ant-launcher-1.9.0.jar
[info] Including from cache: paranamer-2.6.jar
[info] Including from cache: commons-io-2.4.jar
[info] Including from cache: jackson-core-2.3.0.jar
[info] Including from cache: pyrolite-2.0.1.jar
[info] Including from cache: colt-1.2.0.jar
[info] Including from cache: concurrent-1.3.4.jar
[info] Including from cache: py4j-0.8.1.jar
[info] Including from cache: mesos-0.18.1-shaded-protobuf.jar
[info] Including from cache: scala-compiler.jar
[info] Including from cache: jets3t-0.7.1.jar
[info] Including from cache: commons-httpclient-3.1.jar
[info] Including from cache: netty-all-4.0.17.Final.jar
[info] Including from cache: stream-2.5.1.jar
[info] Including from cache: scala-reflect.jar
[info] Including from cache: jackson-databind-2.3.0.jar
[info] Including from cache: jackson-annotations-2.3.0.jar
[warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[warn] Strategy 'discard' was applied to a file
[info] Checking every *.class/*.jar file's SHA-1.
[info] Merging files...
[warn] Merging 'META-INF/DEPENDENCIES' with strategy 'discard'
[info] Assembly up to date: /apps/sameert/software/approxstrmatch/app/target/scala-2.10/app-assembly-0.1-SNAPSHOT.jar

// This is where I start to see the error:

java.lang.RuntimeException: deduplicate: different file contents found in the following:
/home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.transaction/orbits/javax.transaction-1.1.1.v201105210645.jar:META-INF/ECLIPSEF.RSA
/home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.servlet/orbits/javax.servlet-3.0.0.v201112011016.jar:META-INF/ECLIPSEF.RSA
/home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.mail.glassfish/orbits/javax.mail.glassfish-1.4.1.v201005082020.jar:META-INF/ECLIPSEF.RSA
/home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.activation/orbits/javax.activation-1.1.0.v201105071233.jar:META-INF/ECLIPSEF.RSA
    at sbtassembly.Plugin$Assembly$.sbtassembly$Plugin$Assembly$$applyStrategy$1(Plugin.scala:253)
    at sbtassembly.Plugin$Assembly$$anonfun$15.apply(Plugin.scala:270)
    at sbtassembly.Plugin$Assembly$$anonfun$15.apply(Plugin.scala:267)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at sbtassembly.Plugin$Assembly$.applyStrategies(Plugin.scala:272)
    at sbtassembly.Plugin$Assembly$.x$4$lzycompute$1(Plugin.scala:172)
    at sbtassembly.Plugin$Assembly$.x$4$1(Plugin.scala:170)
    at sbtassembly.Plugin$Assembly$.stratMapping$lzycompute$1(Plugin.scala:170)
    at sbtassembly.Plugin$Assembly$.stratMapping$1(Plugin.scala:170)
    at sbtassembly.Plugin$Assembly$.inputs$lzycompute$1(Plugin.scala:214)
    at sbtassembly.Plugin$Assembly$.inputs$1(Plugin.scala:204)
    at sbtassembly.Plugin$Assembly$.apply(Plugin.scala:230)
    at sbtassembly.Plugin$Assembly$$anonfun$assemblyTask$1.apply(Plugin.scala:373)
    at sbtassembly.Plugin$Assembly$$anonfun$assemblyTask$1.apply(Plugin.scala:370)
    at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
    at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:42)
    at sbt.std.Transform$$anon$4.work(System.scala:64)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:237)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:18)
    at sbt.Execute.work(Execute.scala:244)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:237)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:237)
    at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:160)
    at sbt.CompletionService$$anon$2.call(CompletionService.scala:30)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

// Here is the error message.

[error] (approxstrmatch/*:assembly) deduplicate: different file contents found in the following:
[error] /home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.transaction/orbits/javax.transaction-1.1.1.v201105210645.jar:META-INF/ECLIPSEF.RSA
[error] /home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.servlet/orbits/javax.servlet-3.0.0.v201112011016.jar:META-INF/ECLIPSEF.RSA
[error] /home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.mail.glassfish/orbits/javax.mail.glassfish-1.4.1.v201005082020.jar:META-INF/ECLIPSEF.RSA
[error] /home/sameert/.ivy2/cache/org.eclipse.jetty.orbit/javax.activation/orbits/javax.activation-1.1.0.v201105071233.jar:META-INF/ECLIPSEF.RSA
[error] Total time: 4 s, completed Aug 5, 2014 9:53:06 AM

【Question comments】:

【Answer 1】:

Add the following to your build.sbt file:

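assemblyMergeStrategy in assembly := {
  // drop everything under META-INF, including the conflicting ECLIPSEF.RSA files
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  // for any other conflicting path, keep the first copy found
  case x => MergeStrategy.first
}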

This helped me a lot.
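
For readers wondering what the setting does: sbt-assembly consults this function for each file path going into the fat jar, and the strategy it returns decides how conflicting copies are handled. The first case throws away everything under META-INF, which includes the ECLIPSEF.RSA signature files from the error above; the second keeps the first copy of any other conflicting file. A narrower variant, offered only as a sketch, discards just the manifest and the signature files and defers to the previous strategy for everything else. It assumes sbt-assembly 0.12.0 or later (where the key is named assemblyMergeStrategy), and the list of signature extensions is an assumption rather than something the answer specifies.

```scala
// build.sbt (sketch; assumes sbt-assembly 0.12.0+ on sbt 0.13.x)
assemblyMergeStrategy in assembly := {
  // The assembled jar gets its own manifest, so per-jar copies can be dropped.
  case PathList("META-INF", "MANIFEST.MF") => MergeStrategy.discard
  // Signature files (such as ECLIPSEF.RSA) sign a single jar's contents and
  // no longer apply once jars are merged. The extension list is an assumption.
  case PathList("META-INF", name)
      if Seq(".rsa", ".sf", ".dsa").exists(ext => name.toLowerCase.endsWith(ext)) =>
    MergeStrategy.discard
  // Everything else: defer to the strategy that was in effect before.
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Compared with discarding all of META-INF, this keeps entries such as META-INF/services, which some libraries rely on at runtime.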

【Comments】:

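Actually, I think we should only override the merge strategy for META-INF and keep the old strategy for everything else, so:

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

For more details, see sbt-assembly/Merge Strategy.

Worked for me, never heard of anyone.

Works great, but a short explanation of what it does wouldn't hurt.

It did not work like that in Play 2.5, so I replaced assemblyMergeStrategy with mergeStrategy and it worked!

【Answer 2】: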

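Use the "provided" configuration to scope the dependency. A "provided" library is still available at compile time but is left out of the assembled jar, so its files never reach the merge step.

For example: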

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0" % "provided"

If needed, see

https://github.com/sbt/sbt-assembly#excluding-jars-and-files
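
One side effect of "provided" is that the dependency also disappears from the classpath used by sbt's run task. The sbt-assembly README suggests a setting along the following lines to put it back for local runs. It is reproduced here as a sketch in the same sbt 0.13 key syntax as the rest of this page, so check it against the README for your plugin version.

```scala
// build.sbt (sketch, sbt 0.13.x syntax)
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0" % "provided"

// Make `sbt run` use the full compile classpath, which still contains
// "provided" dependencies, while `sbt assembly` keeps leaving them out.
run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated
```

To package every dependency into the jar instead, drop the "provided" qualifier; the conflicting files then have to be resolved with a merge strategy or with exclusions.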

【Comments】:

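Sorry, DOH, there was a "skip update" in the script. This does work, and it is much simpler than the other answers.

Using "provided" also worked for me. It excludes the provided dependencies and keeps the jar light. But what if I want to package all the dependencies/jars; how do I do that?

【Answer 3】:
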
import AssemblyKeys._

name := "approxstrmatch"

version := "1.0"

scalaVersion := "2.10.4"

// unmanagedJars in Compile += file("lib/secondstring-20140729.jar")

libraryDependencies+="org.apache.spark"%%"spark-core"%"1.0.0"

libraryDependencies ++= Seq(
    ("org.apache.spark"%%"spark-core"%"1.0.0").
    exclude("org.eclipse.jetty.orbit", "javax.servlet").
    exclude("org.eclipse.jetty.orbit", "javax.transaction").
    // module name as it appears in the error output: javax.mail.glassfish
    exclude("org.eclipse.jetty.orbit", "javax.mail.glassfish").
    exclude("org.eclipse.jetty.orbit", "javax.activation").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("com.esotericsoftware.minlog", "minlog")
)


resolvers += "AkkaRepository" at "http://repo.akka.io/releases/"



 // buildSettings is not defined in this snippet; it is expected to come from the rest of the build
 lazy val app = Project("approxstrmatch", file("approxstrmatch"),
    settings = buildSettings ++ assemblySettings ++ Seq(
    mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
      {
        case PathList("javax", "servlet", xs @ _*)         => MergeStrategy.first
        case PathList("javax", "transaction", xs @ _*)     => MergeStrategy.first
        case PathList("javax", "mail", xs @ _*)            => MergeStrategy.first
        case PathList("javax", "activation", xs @ _*)      => MergeStrategy.first
        case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
        case "application.conf" => MergeStrategy.concat
        case "unwanted.txt"     => MergeStrategy.discard
        // anything else falls back to the previous strategy
        case x => old(x)
      }
    }
    )
  )


mainClass in assembly := Some("approxstrmatch.JaccardScore")
// jarName in assembly := "approstrmatch.jar"
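
The <<= operator and the un-prefixed mergeStrategy key used above belong to older releases: sbt-assembly 0.12.0 renamed the key to assemblyMergeStrategy, and sbt 1.x removed <<=. As a sketch only, the same rules might look like this on a newer setup. The slash syntax assumes sbt 1.1 or later, and the project layout is simplified to a single-project build.sbt, which is an assumption about the build rather than part of the answer.

```scala
// build.sbt (sketch; assumes sbt 1.1+ and a recent sbt-assembly)
assembly / assemblyMergeStrategy := {
  case PathList("javax", "servlet", xs @ _*)         => MergeStrategy.first
  case PathList("javax", "transaction", xs @ _*)     => MergeStrategy.first
  case PathList("javax", "mail", xs @ _*)            => MergeStrategy.first
  case PathList("javax", "activation", xs @ _*)      => MergeStrategy.first
  case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
  case "application.conf"                            => MergeStrategy.concat
  case "unwanted.txt"                                => MergeStrategy.discard
  // Anything else falls back to the previously configured strategy.
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}

assembly / mainClass := Some("approxstrmatch.JaccardScore")
```

Note that none of these cases match META-INF/ECLIPSEF.RSA, so the signature files named in the error still have to be handled by the exclusions or by a rule for META-INF.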

【Comments】:

【Answer 4】:

module-info.class files have made their way into many libraries. Here is an updated solution:

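ThisBuild / assemblyMergeStrategy := {
  // module descriptors at the jar root or in a nested directory are not needed in a fat jar
  case PathList("module-info.class") => MergeStrategy.discard
  case x if x.endsWith("/module-info.class") => MergeStrategy.discard
  // anything else falls back to the previous strategy
  case x =>
    val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
    oldStrategy(x)
}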

【Comments】:
