Uploading a Local TIFF File to Hadoop

Posted by 王小豆的烂笔头


  Upload a local TIFF file to Hadoop using the GeoTrellis API.

  Maven dependencies:

        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-spark_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-hbase_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
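If you build with sbt instead of Maven, the same three dependencies can be declared roughly as follows (a sketch assuming Scala 2.11 and the versions shown above):

```scala
// build.sbt -- sbt equivalent of the Maven dependencies above
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.locationtech.geotrellis" %% "geotrellis-spark" % "2.1.0",
  "org.locationtech.geotrellis" %% "geotrellis-hbase" % "2.1.0",
  "org.apache.spark"            %% "spark-core"       % "2.1.0"
)
```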

  Core code (Scala):

import geotrellis.proj4.LatLng
import geotrellis.raster.MultibandTile
import geotrellis.raster.resample.Bilinear
import geotrellis.spark._
import geotrellis.spark.io._
import geotrellis.spark.io.hadoop._
import geotrellis.spark.io.hbase._
import geotrellis.spark.io.index.ZCurveKeyIndexMethod
import geotrellis.spark.pyramid.Pyramid
import geotrellis.spark.tiling.{FloatingLayoutScheme, ZoomedLayoutScheme}
import geotrellis.vector.ProjectedExtent
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

import scala.io.StdIn

object UploadTiff {
  val inputPath = "H:\\BigData\\tiff\\GF1_PMS2_E120.6_N31.0_20151203_L1A0001216039_fusion.tif"
  val outputPath = "hdfs://localhost:9000/catalogLatLng"

  def main(args: Array[String]): Unit = {
    // Setup Spark to use Kryo serializer.
    val conf =
      new SparkConf()
        .setMaster("local[*]")
        .setAppName("Spark Tiler")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryo.registrator", "geotrellis.spark.io.kryo.KryoRegistrator")

    val sc = new SparkContext(conf)
    try {
      run(sc)
      // Pause before closing the Spark context,
      // so that you can check out the UI at http://localhost:4040
      println("Hit enter to exit.")
      StdIn.readLine()
    } finally {
      sc.stop()
    }
  }

  def fullPath(path: String) = new java.io.File(path).getAbsolutePath

  def run(implicit sc: SparkContext) = {
    // Read the geotiff in as a single image RDD,
    // using a method implicitly added to SparkContext by
    // an implicit class available via the
    // "import geotrellis.spark.io.hadoop._ " statement.
    val inputRdd: RDD[(ProjectedExtent, MultibandTile)] = {
      sc.hadoopMultibandGeoTiffRDD(inputPath)
    }

    // Use the "TileLayerMetadata.fromRDD" call to find the zoom
    // level that is the closest match to the resolution of our source image,
    // and to derive information such as the full bounding box and data type.
    val (_, rasterMetaData) =
    TileLayerMetadata.fromRDD(inputRdd, FloatingLayoutScheme(512))

    // Use the tiler to cut the source imagery into tiles indexed by a floating layout scheme.
    // We'll repartition it so that there are more partitions to work with, since Spark
    // likes to work with more, smaller partitions (to a point) over few large partitions.
    val tiled: RDD[(SpatialKey, MultibandTile)] =
    inputRdd
      .tileToLayout(rasterMetaData.cellType, rasterMetaData.layout, Bilinear)
      .repartition(100)

    // We'll be tiling the images using a zoomed layout scheme
    // in the LatLng (EPSG:4326) projection, matching the "catalogLatLng" catalog name.
    // We'll be creating 256 x 256 tiles.
    val layoutScheme = ZoomedLayoutScheme(LatLng, tileSize = 256)

    // We need to reproject the tiles to LatLng
    val (zoom, reprojected): (Int, RDD[(SpatialKey, MultibandTile)] with Metadata[TileLayerMetadata[SpatialKey]]) =
      MultibandTileLayerRDD(tiled, rasterMetaData)
        .reproject(LatLng, layoutScheme, Bilinear)

    // Create the attributes store that will tell us information about our catalog.
    //    val attributeStore = HadoopAttributeStore(outputPath)
    val zookeepers: Seq[String] = List("localhost:2181")
    val master: String = "localhost"

    val hbaseInstance = HBaseInstance(zookeepers, master)
    val attributeStore = HBaseAttributeStore(hbaseInstance)

    // Create the writer that we will use to store the tiles in the HBase catalog.
    //    val writer = HadoopLayerWriter(outputPath, attributeStore)
    val catalog: String = "catalogLatLng"
    val writer = HBaseLayerWriter(attributeStore, catalog)

    // Pyramiding up the zoom levels, write our tiles out to HBase.
    Pyramid.upLevels(reprojected, layoutScheme, zoom, Bilinear) { (rdd, z) =>
      val layerId = LayerId("landsat", z)
      // If the layer exists already, delete it before writing
      if (attributeStore.layerExists(layerId)) {
        //        new HadoopLayerManager(attributeStore).delete(layerId)
        new HBaseLayerManager(attributeStore, hbaseInstance).delete(layerId)
      }
      writer.write(layerId, rdd, ZCurveKeyIndexMethod)
    }
  }
}
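After the pyramid is written, you can read a layer back out of HBase to verify the upload. The sketch below mirrors the instance and layer name used above ("landsat", localhost ZooKeeper/master); it assumes a running HBase cluster and an active SparkContext, and the exact `HBaseLayerReader` factory signature may vary slightly between GeoTrellis versions:

```scala
import geotrellis.raster.MultibandTile
import geotrellis.spark._
import geotrellis.spark.io._
import geotrellis.spark.io.hbase._
import org.apache.spark.SparkContext

object VerifyUpload {
  // Read one zoom level of the "landsat" layer back from the HBase catalog.
  def readLayer(zoom: Int)(implicit sc: SparkContext): MultibandTileLayerRDD[SpatialKey] = {
    val instance = HBaseInstance(List("localhost:2181"), "localhost")
    val reader = HBaseLayerReader(HBaseAttributeStore(instance))
    reader.read[SpatialKey, MultibandTile, TileLayerMetadata[SpatialKey]](
      LayerId("landsat", zoom))
  }
}
```

Calling `readLayer(zoom).count()` forces the read and reports how many tiles were stored at that zoom level.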
