Database vs Data Mart vs Data Warehouse vs Data Lake
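【Title】Database vs DataMart vs Data Warehouse vs Data Lake 【Posted】2020-05-12 12:23:31 【Question】Looking for a high-level comparison of the differences between:
Database, Data Mart (top-down approach), Data Warehouse, and Data Lake. Where no specifics are available, please use a relative comparison.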
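【Answer 1】Below is a high-level comparison of the data tiers mentioned. If any of it needs correcting, feel free to leave a comment.
Note: run the HTML snippet to see the result.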
<style>
#dataTierComparison {
  font-family: "Trebuchet MS", Arial, Helvetica, sans-serif;
  border-collapse: collapse;
  width: 100%;
}

#dataTierComparison td,
#dataTierComparison th {
  border: 1px solid #ddd;
  padding: 8px;
}

#dataTierComparison tr:nth-child(even) {
  background-color: #f2f2f2;
}

#dataTierComparison tr:hover {
  background-color: #ddd;
}

#dataTierComparison th {
  padding-top: 12px;
  padding-bottom: 12px;
  text-align: left;
  background-color: #4CAF50;
  color: white;
}
</style>
<table id="dataTierComparison">
<tbody>
<tr>
<th> </th>
<th>Database</th>
<th>Data Mart (Top-down)</th>
<th>Data Warehouse</th>
<th>Data Lake</th>
</tr>
<tr>
<th>Source</th>
<td>Single</td>
<td>Single</td>
<td>Multiple</td>
<td>Multiple</td>
</tr>
<tr>
<th>Structure</th>
<td>Structured</td>
<td>Structured</td>
<td>Structured</td>
<td>Raw</td>
</tr>
<tr>
<th>Purpose</th>
<td>Determined</td>
<td>Determined</td>
<td>Determined</td>
<td>Undetermined</td>
</tr>
<tr>
<th>Storage</th>
<td>Centralized</td>
<td>Decentralized</td>
<td>Centralized</td>
<td>Centralized</td>
</tr>
<tr>
<th>Data Format</th>
<td>Detailed</td>
<td>Summarized</td>
<td>Detailed</td>
<td>All</td>
</tr>
<tr>
<th>Flexibility</th>
<td>Low</td>
<td>Medium</td>
<td>Medium</td>
<td>High</td>
</tr>
<tr>
<th>Primary Use</th>
<td>Transactional</td>
<td>Reporting</td>
<td>Analytics & Reporting</td>
<td>Analytics</td>
</tr>
<tr>
<th>Cost</th>
<td>Low</td>
<td>Medium</td>
<td>Medium</td>
<td>High</td>
</tr>
<tr>
<th>Data Volume</th>
<td>Low</td>
<td>Low</td>
<td>Medium</td>
<td>High</td>
</tr>
<tr>
<th>Development</th>
<td>Top-down</td>
<td>Bottom-up</td>
<td>Top-down</td>
<td>All</td>
</tr>
<tr>
<th>Design Time</th>
<td>Medium</td>
<td>Medium</td>
<td>High</td>
<td>Low</td>
</tr>
<tr>
<th>Volatility</th>
<td>Medium</td>
<td>Low</td>
<td>None</td>
<td>None</td>
</tr>
<tr>
<th>Data Operations</th>
<td>CRUD</td>
<td>CR</td>
<td>CRU</td>
<td>CR</td>
</tr>
<tr>
<th>Subject Area</th>
<td>Single</td>
<td>Single</td>
<td>Multiple</td>
<td>Multiple</td>
</tr>
<tr>
<th>Design Schema</th>
<td>Relational</td>
<td>Multi-dimensional</td>
<td>Relational</td>
<td>No Schema</td>
</tr>
</tbody>
</table>
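To make a few rows of the table concrete (Structure, Data Format, Data Operations, Design Schema), here is a minimal Python sketch. It is only an illustration under stated assumptions: the `orders` table, the in-memory `lake` list, and the helpers `lake_append` and `lake_read` are hypothetical names invented for this example, and a real warehouse or lake would use dedicated storage rather than SQLite and a Python list.

```python
import json
import sqlite3

# Database-style tier: the schema is fixed up front and rows support full CRUD.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",  # Create
    [(1, "EU", 10.0), (2, "EU", 5.0), (3, "US", 7.5)],
)
conn.execute("UPDATE orders SET amount = 12.0 WHERE id = 1")  # Update
conn.execute("DELETE FROM orders WHERE id = 2")               # Delete

# Data-mart-style slice: summarized data for a single subject, used for reporting.
mart = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
)

# Data-lake-style tier (hypothetical stand-in): raw records are appended as-is,
# with no schema enforced on write and no update or delete.
lake = []


def lake_append(record):
    """Store the record in raw (serialized) form, whatever its shape."""
    lake.append(json.dumps(record))


def lake_read(field):
    """Schema-on-read: return `field` from every raw record that carries it."""
    return [rec[field] for rec in map(json.loads, lake) if field in rec]


lake_append({"id": 1, "region": "EU", "amount": 10.0})
lake_append({"event": "click", "page": "/home"})  # a different shape is accepted

print(mart)
print(lake_read("amount"))
print(lake_read("page"))
```

The point of the sketch is the contrast: the table rejects anything that does not fit its columns but allows updates and deletes, while the lake accepts any record and leaves interpretation to whoever reads it.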