
Aggregation and Degradation in JetStream: Streaming analytics in the wide area

Abstract

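The paper uses two techniques, aggregation and degradation, to resolve the mismatch between data volume and available bandwidth, and to keep data from going stale.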


Our adaptive control mechanisms are responsive enough to keep end-to-end latency within a few seconds, even when available bandwidth drops by a factor of two, and are flexible enough to express practical policies.

In other words, latency stays low even when available bandwidth drops by a factor of two.

Introduction

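  • Key terms: aggregation, degradation, MapReduce
  • Sensors, storage, and processors are far cheaper than bandwidth, so bandwidth becomes the bottleneck; the alternative is to over-provision bandwidth (because the system cannot adapt)
  • Degradation usually lowers accuracy, so the authors want to apply only the minimum degradation necessary
  • Integrating aggregation and degradation into a streaming system raises three challenges:
  • The storage system must support real-time aggregation
  • Adaptation must be performed on a timescale of seconds to keep latency low
  • Users must be able to define their own policies in a sufficiently expressive language
  • "We consider our architecture and its associated interfaces to be the key contribution of this paper." That is, the authors regard the system architecture and its API as their main contribution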

Design Overview

  • Integrating structured storage
  • Reducing data volumes
  • Programming model

Adaptive Degradation


  • Many useful degradations have data-dependent bandwidth savings, so the savings cannot be known in advance
  • If the data since the last marker was generated over k seconds, the sender records the time t between seeing the last marker and receiving this acknowledgment, and uses k/t as an estimate of available bandwidth relative to the current sending rate
  • If k > t, then k/t > 1: bandwidth is sufficient
  • If k < t, then k/t < 1: bandwidth is scarce
  • “By default, send all images at maximum fidelity from CCTV cameras to a central repository. If bandwidth is insufficient, switch to sending images at 75% fidelity, then 50% if there still isn’t enough bandwidth. Beyond that point, reduce the frame rate, but keep the images at 50% fidelity.”
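The k/t signal and the quoted CCTV policy can be combined into a small control loop. The following is only a sketch under stated assumptions, not JetStream's actual interface: the names `Level`, `LADDER`, `bandwidth_ratio`, and `next_level` are hypothetical, the frame-rate steps are made up for illustration, and the `headroom` threshold for restoring quality is an assumption added to avoid oscillation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """One rung of a hypothetical degradation ladder."""
    fidelity: float    # image quality as a fraction of maximum
    frame_rate: float  # fraction of the original frame rate


# Ladder mirroring the quoted policy: lower fidelity first (100% -> 75% -> 50%),
# then reduce the frame rate while holding fidelity at 50%.
# The frame-rate steps (0.5, 0.25) are illustrative assumptions.
LADDER = [
    Level(1.00, 1.0),
    Level(0.75, 1.0),
    Level(0.50, 1.0),
    Level(0.50, 0.5),
    Level(0.50, 0.25),
]


def bandwidth_ratio(k: float, t: float) -> float:
    """Return k/t, where k is the number of seconds over which the data since
    the last marker was generated and t is the time until the acknowledgment."""
    if t <= 0:
        raise ValueError("t must be positive")
    return k / t


def next_level(current: int, ratio: float, headroom: float = 1.2) -> int:
    """Choose the next ladder index from the bandwidth ratio.

    ratio < 1: bandwidth is scarce, so degrade one step further.
    ratio > headroom: there is spare capacity, so restore one step.
    Otherwise stay at the current level.
    """
    if ratio < 1.0:
        return min(current + 1, len(LADDER) - 1)
    if ratio > headroom:
        return max(current - 1, 0)
    return current


# Simulated (k, t) measurements: three congested intervals, then recovery.
level = 0
for k, t in [(1.0, 2.0), (1.0, 1.5), (1.0, 1.05), (1.0, 0.5)]:
    level = next_level(level, bandwidth_ratio(k, t))
    print(LADDER[level])
```

Stepping one rung at a time keeps the degradation as mild as the measured bandwidth allows, which matches the goal of applying the minimum degradation necessary.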

Degradation

The best degradation for a given application depends not only on the statistics of the data, but also on the set of queries that may be applied to the data.
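To illustrate why the query set matters, the sketch below compares two generic degradations on made-up data: coarsening the time dimension and random sampling. The record layout and the helper names `coarsen` and `sample` are assumptions for illustration, not operators taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical records: (second, url, request_count).
records = [(s, f"/page{s % 3}", 1 + s % 4) for s in range(120)]


def coarsen(rows, bucket=60):
    """Roll per-second counts up into `bucket`-second windows per URL."""
    out = defaultdict(int)
    for second, url, count in rows:
        out[(second // bucket * bucket, url)] += count
    return [(sec, url, c) for (sec, url), c in sorted(out.items())]


def sample(rows, rate, seed=0):
    """Keep each row independently with probability `rate`."""
    rng = random.Random(seed)
    return [r for r in rows if rng.random() < rate]


total = sum(c for _, _, c in records)
coarse = coarsen(records)
sampled = sample(records, rate=0.5)

# Query 1: total requests. Coarsening preserves the exact sum.
print("exact:", total, "coarsened:", sum(c for _, _, c in coarse))
# Sampling yields only an estimate, after scaling by 1/rate.
print("sampled estimate:", sum(c for _, _, c in sampled) / 0.5)

# Query 2: per-second detail. Coarsening has discarded it, while
# sampling keeps the original timestamps for the rows that survive.
print("distinct timestamps:",
      len({sec for sec, _, _ in coarse}),
      len({sec for sec, _, _ in sampled}))
```

Both degradations send fewer rows, but a workload dominated by totals is served better by coarsening, while one that needs fine-grained timestamps may prefer sampling.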


Evaluation

image-20200312121033768

Summary

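The paper's emphasis is on the system itself. Its aggregation side builds on what traditional stream-processing systems already do, while its treatment of degradation makes a good start.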