COMP9313 Week9a-0

Posted by cheviszhang


https://drive.google.com/drive/folders/13_vsxSIEU9TDg1TCjYEwOidh0x3dU6es

https://www.cse.unsw.edu.au/~cs9313/20T2/slides/L8.pdf

 

Mining Data Streams

 

1.  Data Streams

  1) Stream management is important when the input rate is controlled externally.

  2) We can think of the data as infinite and non-stationary (the distribution changes over time).

 

2. DBMS vs. Data Stream

  1) Random access is expensive (on disk), so data streams are processed with single-scan algorithms.
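
The single-scan idea can be illustrated with a small Python sketch. It is not taken from the lecture slides: the function name `running_mean` and the choice of a running average as the statistic are assumptions made only for illustration. The point is that each item is read once, in arrival order, and only constant-size state is kept.

```python
from typing import Iterable, Iterator, Tuple


def running_mean(stream: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Yield (count, mean) after each item, reading the stream exactly once.

    Hypothetical helper for illustration: no item is stored or revisited,
    so memory use stays constant however long the stream runs.
    """
    count = 0
    mean = 0.0
    for x in stream:
        count += 1
        # Incremental update: new_mean = old_mean + (x - old_mean) / count
        mean += (x - mean) / count
        yield count, mean


# Usage: any iterator works, including one that never ends.
results = list(running_mean(iter([2.0, 4.0, 6.0])))
```

Because the state is just a counter and one float, the same loop works whether the stream has ten items or never terminates.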

 

3. Sampling from a Data Stream

  1) Sample a fixed proportion of the data (e.g. 10%)

    1) Naive approach: randomly keep 10% of the items

    2)

  2) Sample a fixed-size set of the data
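
The two sampling goals above can be sketched in Python. The first function follows the naive approach in the notes: keep each item independently with probability 0.1. The notes do not say how to draw a fixed-size sample, so the second function uses reservoir sampling, a standard technique chosen here as an assumption. The function names and the `rng` parameter are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def sample_fixed_proportion(
    stream: Iterable[T], p: float = 0.1, rng: Optional[random.Random] = None
) -> Iterator[T]:
    """Naive approach: keep each item independently with probability p."""
    if rng is None:
        rng = random.Random()
    for item in stream:
        if rng.random() < p:
            yield item


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform random sample of at most k items in one pass.

    Reservoir sampling (an assumption, not named in the notes): the n-th
    item is kept with probability k/n and replaces a random kept item.
    """
    if rng is None:
        rng = random.Random()
    reservoir: List[T] = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            reservoir.append(item)  # fill the reservoir first
        else:
            j = rng.randrange(n)  # uniform integer in [0, n)
            if j < k:  # happens with probability k/n
                reservoir[j] = item
    return reservoir


# Usage with a seeded generator so runs are repeatable.
rng = random.Random(0)
tenth = list(sample_fixed_proportion(range(10_000), 0.1, rng))
fixed = reservoir_sample(range(10_000), 100, rng)
```

The proportional sample grows with the stream (roughly 10% of whatever has arrived), while the reservoir never holds more than `k` items, which is why the two cases are treated separately.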

 
