Hot Topics on Data Center (HotDC) 2018

Posted tinoryj


Keynote Session

Accelerate Machine Intelligence: An Edge to Cloud Continuum

Hadi Esmaeilzadeh - UCSD

Background

open source: http://act-lab.org/artifacts

CoSMIC stack

how to distribute
  • understanding machine learning - solving an optimization problem
  • abstraction between algorithm and acceleration system - parallelized stochastic gradient descent solver (targets: FPGA, GPU, ASIC, CGRA, Xeon Phi)
  • leverage linearity of differentiation for distributed learning
  • programming and compilation
    • build a new language for math
    • dataflow graph generation
how to design customizable accelerator
  • multi-threading acceleration
  • connectivity and bussing
  • PE architecture - make hardware simple
how to reduce overhead of distributed coordination
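
The "linearity of differentiation" point above can be illustrated with a small sketch. It is not CoSMIC code: the one-parameter least-squares model, the two-way data split, and all function names are assumptions made for illustration, and it uses full-batch gradient descent rather than the stochastic solver mentioned in the talk. Because the gradient of a sum is the sum of the gradients, each worker can compute a partial gradient on its own partition and the aggregator only has to add them.

```python
def partial_gradient(w, samples):
    """Sum of per-sample gradients of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in samples)


def distributed_gradient(w, partitions):
    """Each worker computes a partial gradient on its own partition;
    the aggregator only has to add the partial results."""
    return sum(partial_gradient(w, part) for part in partitions)


def train(partitions, w=0.0, lr=0.01, steps=100):
    """Plain gradient descent driven by the aggregated gradient."""
    for _ in range(steps):
        w -= lr * distributed_gradient(w, partitions)
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated from y = 2x
partitions = [data[:2], data[2:]]                          # two hypothetical workers

full = partial_gradient(0.5, data)               # gradient on the whole data set
split = distributed_gradient(0.5, partitions)    # same gradient, computed per partition
w_final = train(partitions)                      # approaches the true slope 2.0
```

The aggregated gradient matches the single-node gradient, so the distributed run follows the same optimization trajectory as a single-node run on the full data.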

specialized system software in CoSMIC

benchmarks
  • 16-node CoSMIC with UltraScale+ FPGAs offers an 18.8x speedup over 16-node Spark with E3 Skylake CPUs
  • the speedup is attributed to the FPGA (66%) and to software (34%)

RoboX Accelerator Architecture

DNNs tolerate low-bitwidth operations - bit-level

Making Cloud Systems Reliable and Dependable: Challenges and Opportunities

Lidong Zhou - MSRA

Background

system reliability:

  • Fault Tolerance
  • Redundancies
  • State Machine Replication
  • Paxos
  • Erasure Coding

Real-World Gray Failures in Cloud

  • redundancies in data center networking
  • active device and link failure localization in data center
  • NetBouncer: large-scale path probing and diagnosis
  • NetBouncer: leverage the power of scale
  • root cause of a gray failure - requests are stuck due to a network issue while the heartbeat still looks normal
  • Insight: detect the errors that requesters see
    • critical gray failures are observable
    • from error handling to error reporting

Solution - Panorama

  • Analysis - automatically convert a software component into an in-situ observer
  • Runtime - observers send their observations to a local observation store (LOS)
    • locate ob-boundary
    • observations not always direct
    • observations split to ob-origin & ob-sink
    • match ob-origin & ob-sink
  • Detect what "requesters" see
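
A minimal sketch of the idea of judging health from what requesters see, rather than from heartbeats alone. This is not Panorama's API: the `LocalObservationStore` class, its methods, and the majority rule are all hypothetical names and choices made for illustration.

```python
class LocalObservationStore:
    """Hypothetical store collecting observations reported by in-situ observers."""

    def __init__(self):
        self._observations = {}  # subject -> {observer: latest ok flag}

    def report(self, observer, subject, ok):
        """Record the latest observation an observer made about a subject."""
        self._observations.setdefault(subject, {})[observer] = ok

    def verdict(self, subject):
        """Majority vote over the latest observation of each observer (assumed rule)."""
        latest = self._observations.get(subject, {})
        if not latest:
            return "unknown"
        failures = sum(1 for ok in latest.values() if not ok)
        return "unhealthy" if failures * 2 > len(latest) else "healthy"


store = LocalObservationStore()
store.report("heartbeat-monitor", "storage-node-7", ok=True)   # heartbeat still normal
store.report("frontend-1", "storage-node-7", ok=False)         # request stuck
store.report("frontend-2", "storage-node-7", ok=False)         # request stuck
status = store.verdict("storage-node-7")
```

In this gray-failure scenario the heartbeat alone would report the node as alive, while the requesters' error reports tip the verdict to unhealthy.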

Reliability of Large-Scale Distributed Systems

  • foundation reliability
  • rethink cloud reliability: new theory & new method
  • understand gray failure
  • systematic and comprehensive observations

paper: Gray Failure: The Achilles' Heel of Cloud-Scale Systems

Deconstructing RDMA-enabled Distributed Transactions: Hybrid is Better!

Haibo Chen - SJTU

Background

  • (Distributed) Transactions were slow
  • High cost for distributed TX - usually 10s~100s of thousands of TPS (SIGMOD'12)
  • only 4% of wall-clock time spent in useful data processing

new features:

  • RDMA: remote direct memory access
    • ultra low latency(5us)
    • ultra high throughput
  • NVM: Non-volatile memory

An Active Line of Research of RDMA-enabled TX

  • DrTM - DrTM(SOSP 2015) DrTM-R(EuroSys 2016) DrTM-B(USENIX ATC 2017)
  • FaRM - FaRM-KV(NSDI 2014) FaRM-TX(SOSP 2015)
  • FaSST(OSDI 2016)
  • LITE(SOSP 2017)

Transaction(TX)s

  • protocols - OCC, 2PL, SI, ...
  • implementations on hardware - CX3, CX4, CX5, RoCE, one-sided, two-sided, ...
  • OLTP workloads - TPC-C, TPC-E, TATP, Smallbank

Main: Use RDMA in TXs

outline:

  • RDMA primitive-level analysis
  • Phase-by-phase analysis for TX
  • DrTM+H: Putting it all together

content:

  • phase: Exe/Val/Log/Commit
  • offloading with one-sided primitives improves performance
  • one-sided primitives scale well on modern RNICs
  • Execution framework & DrTM+H: https://github.com/SJTU-IPADS/drtmh
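
The four phases listed above (Exe/Val/Log/Commit) can be sketched for an OCC-style transaction. This is a single-process illustration, not DrTM+H code: there is no RDMA, no locking, and no replication, and the `Store` class and `run_tx` helper are hypothetical names. A real system would choose a one-sided or two-sided primitive per phase.

```python
class Store:
    """Hypothetical in-memory versioned key-value store."""

    def __init__(self, initial):
        self.data = {key: (1, value) for key, value in initial.items()}
        self.log = []

    def read(self, key):
        return self.data.get(key, (0, None))  # (version, value)


def run_tx(store, logic):
    read_set, write_set = {}, {}

    def get(key):
        if key in write_set:
            return write_set[key]
        version, value = store.read(key)
        read_set.setdefault(key, version)  # remember the version first seen
        return value

    def put(key, value):
        write_set[key] = value             # buffer writes locally

    logic(get, put)                        # 1. Execution
    for key, version in read_set.items():  # 2. Validation
        if store.read(key)[0] != version:
            return False                   # someone committed in between: abort
    store.log.append(dict(write_set))      # 3. Logging
    for key, value in write_set.items():   # 4. Commit
        store.data[key] = (store.read(key)[0] + 1, value)
    return True


store = Store({"a": 100, "b": 0})


def transfer(get, put):
    put("a", get("a") - 30)
    put("b", get("b") + 30)


def conflicted(get, put):
    balance = get("a")
    store.data["a"] = (store.data["a"][0] + 1, 0)  # simulated concurrent commit
    put("a", balance - 10)


committed = run_tx(store, transfer)
aborted = not run_tx(store, conflicted)
```

The second transaction aborts in the validation phase because the version of `a` changed after it was read, so nothing reaches the log or the store.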

RDMA in Data Centers: from Cloud Computing to Machine Learning

Chuanxiong Guo - ByteDance

Background

  • Data center networks (DCNs) offer many services
    • single ownership
    • large scale
    • bisection bandwidth
  • TCP/IP does not work well
    • latency
    • bandwidth
    • processing overhead(40G) - 12% CPU at receiver & 6% CPU at sender

RDMA over Commodity Ethernet (RoCEv2)

  • no CPU overhead
  • single QP: 88Gb/s at 1.7% CPU usage (TCP with 8 connections: 30-50Gb/s, at 2.6% client CPU and 4.3% server CPU)
  • RoCEv2 needs a lossless ethernet network
    • PFC(priority-based flow control) hop-by-hop flow control
    • DCQCN - sender-switch-receiver (RP-CP-NP)
  • the slow-receiver symptom - ToR to NIC is 40Gb/s and NIC to server is 64Gb/s, yet the NIC may generate a large number of PFC pause frames

RDMA for DNN Training Acceleration

  • understanding tasks using DNNs
  • DNN training: backpropagation (BP)
  • Distributed ML training on GPUs, with mini-batches
  • RDMA acceleration: ResNet, RNNs, DNNs (RDMA performs better than TCP)

Highlighted Research Session

Congestion Control Mechanisms in Data Center Networks

Wei Bai - MSRA

Achieving low latency in DCNs

  • queueing delay - PIAS (NSDI 2015)
  • packet-loss retransmission delay - TLT

PIAS

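  • Flow completion time (FCT) is the key problem
  • Flow information cannot be assumed to be known in advance; the scheme should be quickly deployable on existing devices
  • PIAS uses a multi-level feedback queue (MLFQ) to emulate shortest job first (SJF)
  • three functions in PIAS:
    • packet tagging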
    • switch
    • rate control
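
A minimal sketch of how MLFQ-style packet tagging can emulate SJF without knowing flow sizes: every flow starts at the highest priority and is demoted as it sends more bytes, so short flows finish while still in the top queue. The threshold values and the `FlowTagger` class are assumptions for illustration, not the PIAS implementation.

```python
import bisect

# Hypothetical demotion thresholds in bytes; priority 0 is the highest.
THRESHOLDS = [100_000, 1_000_000]


def priority_for(bytes_sent):
    """Priority is the number of thresholds the flow has already reached."""
    return bisect.bisect_right(THRESHOLDS, bytes_sent)


class FlowTagger:
    """End-host side: tracks bytes sent per flow and tags each packet."""

    def __init__(self):
        self.bytes_sent = {}

    def tag(self, flow_id, packet_size):
        sent = self.bytes_sent.get(flow_id, 0)
        priority = priority_for(sent)          # based on bytes sent so far
        self.bytes_sent[flow_id] = sent + packet_size
        return priority


tagger = FlowTagger()
long_flow_tags = [tagger.tag("long", 60_000) for _ in range(3)]
short_flow_tag = tagger.tag("short", 1_500)
```

Switches would then serve the tagged packets with strict priority queueing, which is the "switch" function in the list above.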

TLT

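  • get the benefits of both lossy and lossless networks at the same time
  • use PFC to eliminate congestion packet losses
  • packet loss:
    • middle - fast retransmissions
    • tail - timeout retransmissions
    • identify important packets; when the switch queue exceeds a threshold, drop the unimportant packets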
Understanding the challenges of Scaling Distributed DNN Training

Cheng Li - USTC

  • Deep learning is growing fast
  • DNN - Deep Neural Networks
  • benefit: more data / bigger models / more computation
  • Jeff Dean - Google

Distributed DNN

  • Model or data parallelism
    • data parallelism is a primary choice
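  • BSP / ASP - BSP is the usual choice (ASP may not converge)
    • Bulk Synchronous Parallel - synchronizes at fixed points in time
    • Asynchronous Parallel
  • network, servers, and other bottlenecks limit parallelism
  • Use measurements to identify the constraints that limit compute capability
    • compression overhead introduced by compressed data transfer
  • System design
    • elastic system design
    • weakest-link effect - the slowest component bounds the final computation speed
    • how to quickly adjust the scale of the system, etc. - message-bus stream processing - using a producer-consumer model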
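
A small sketch of one data-parallel BSP step shows why the slowest worker bounds the speed. The gradient values, timings, and the `bsp_step` helper are hypothetical.

```python
def bsp_step(worker_grads, worker_times):
    """One BSP iteration: wait for every worker, then average the gradients.

    Returns the averaged gradient and the wall-clock time of the step,
    which equals the time of the slowest worker because of the barrier.
    """
    n = len(worker_grads)
    averaged = [sum(column) / n for column in zip(*worker_grads)]
    return averaged, max(worker_times)


grads = [[1.0, 2.0], [3.0, 4.0]]  # per-worker gradients for a 2-parameter model
times = [0.5, 2.0]                # seconds each worker needed (one straggler)
avg_grad, step_time = bsp_step(grads, times)
```

Every worker contributes to the same model version, which is why BSP converges more reliably than ASP, but the step takes as long as the straggler.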

Octopus: an RDMA-enabled Distributed Persistent Memory File System

Youyou Lu - Tsinghua

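  • Distributed file system design
  • Non-volatile memory - in-memory storage
  • DRAM Limitations
    • Cell Density
    • Refresh - performance / power consumption
  • NVDIMM - retains data after power loss
  • Intel 3D XPoint - near-DRAM latency, high capacity, non-volatile across power loss
  • RDMA - used in high-performance environments
  • DiskGluster - latency comes from the HDD | MemGluster - latency comes from software
  • RDMA-enabled Distributed File System
    • Shared data management
    • New data flow strategies
    • Efficient RPC design
    • Concurrency control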

Design

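  • I/O handling
    • organize all NVMM into a single shared space
    • reduce data copies in the DFS (from 7 to 4)
    • the server looks up the data's storage address; the client receives the address and then fetches the data itself (shifting the work to the client)
  • Metadata RPC
  • Collect-Dispatch Distributed Transaction
  • Performance evaluation
    • tested between servers on a LAN - bandwidth reaches 88% of the network bandwidth
    • also tested on the Hadoop platform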

Short Talk

Computer Organization and Design Course with FPGA Cloud, Ke Zhang (ICT, CAS)

New technologies: AI, IoT
Build up new hardware/software co-design skills - CPU, GPU, FPGA, ASIC
ZyForce platform - virtual FPGA labs

ActionFlow: A Framework for Fast Multi-Robots Application Development, Jimin Han (UCAS)

UCAS senior undergraduate - started in August 2018
Rapid development of robot applications

Labeled Network Stack, Yifan Shen (ICT, CAS)

Caching or Not: Rethinking Virtual File System for Non-Volatile Main Memory, Ying Wang (ICT, CAS)

Data Motif-based Proxy Benchmarks for Big Data and AI Workloads, Chen Zheng (ICT, CAS)








