InfluxDB First Exploration
Key characteristics:
Stores time series data:
- Data volume is large
- High number of concurrent reads/writes
- Optimized for (C)reate and (R)ead: TSM Tree
- Limited support for (U)pdate and (D)elete
Series data is more important than individual data points:
- No traditional row id; data is identified by series and timestamp
- Good support for aggregation
Schemaless
- New tags can be added to new data records on demand
Key concepts:
Measurement:
- Similar to SQL table
Tag:
- Searchable
- Indexed
- Value must be a string
- Cardinality limitation (memory)
- Optional
- Supports group by
Field:
- Searchable (full scan)
- Not indexed
- Value can be string, float, integer, or boolean
- At least one field is required per measurement (table)
- Supports aggregation functions
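Tags and fields map directly onto InfluxDB's line protocol: tags join the measurement in the series key, while fields carry the values. A minimal sketch in Python — the measurement, tag, and field names are made up for illustration, and no escaping of special characters is attempted:

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build an InfluxDB line protocol string. Tag values are always
    strings; field values keep their type-specific formatting."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))

    def fmt(v):
        if isinstance(v, bool):          # check bool before int
            return "true" if v else "false"
        if isinstance(v, int):
            return f"{v}i"               # integer fields get an 'i' suffix
        if isinstance(v, str):
            return f'"{v}"'              # string field values are quoted
        return str(v)                    # floats are written as-is

    field_part = ",".join(f"{k}={fmt(v)}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"

line = to_line_protocol("cpu", {"host": "server01"},
                        {"usage": 0.64, "cores": 8},
                        1_609_459_200_000_000_000)
# cpu,host=server01 cores=8i,usage=0.64 1609459200000000000
```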
Retention policy:
- Two key parts:
Duration: how long data is kept
Replication: the replication factor
- One measurement can have many retention policies
- Default retention policy: duration = infinite and replication factor = 1
- Can be altered at runtime
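The two key parts combine into a CREATE RETENTION POLICY statement. A small sketch that composes one; the 90-day duration for the "ninetyday" policy used later in these notes is an assumption:

```python
def create_rp_statement(name, db, duration, replication=1, default=False):
    """Compose a CREATE RETENTION POLICY statement from its two key
    parts: duration (how long data is kept) and replication factor."""
    stmt = (f'CREATE RETENTION POLICY "{name}" ON "{db}" '
            f'DURATION {duration} REPLICATION {replication}')
    return stmt + " DEFAULT" if default else stmt

# The "ninetyday" policy queried later could have been created like
# this (the 90d duration is a guess from its name):
print(create_rp_statement("ninetyday", "monitordb", "90d"))
```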
Series:
The collection of data in the InfluxDB data structure that share a measurement, tag set, and retention policy. (not including fields)
Shard:
- Mapped to a TSM file
- Belongs to a single shard group
- A shard group can have multiple shards, each containing a different set of series
- A shard group has a shard duration, which determines the time span of the data it contains
- Default shard group duration can be deduced from retention policy duration
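The deduction rule from the InfluxDB 1.x docs can be sketched as follows (6 months is approximated here as 180 days):

```python
from datetime import timedelta

def default_shard_group_duration(rp_duration):
    """InfluxDB 1.x default shard group duration, deduced from the
    retention policy duration. None represents an infinite retention
    policy, which gets the longest default (7 days)."""
    if rp_duration is None or rp_duration > timedelta(days=180):
        return timedelta(days=7)   # RP longer than ~6 months
    if rp_duration >= timedelta(days=2):
        return timedelta(days=1)   # RP between 2 days and ~6 months
    return timedelta(hours=1)      # RP shorter than 2 days
```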
Special Features:
Continuous query:
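The notes leave this section empty. A continuous query periodically downsamples data into another measurement. A hypothetical sketch that builds such a statement — the target measurement name test1_hourly is made up, while monitordb, test1, and eventtype come from later in these notes:

```python
def create_cq_statement(name, db, target, source, agg, field, interval):
    """Compose a CREATE CONTINUOUS QUERY statement that aggregates
    `field` from `source` into `target` every `interval`."""
    return (f'CREATE CONTINUOUS QUERY "{name}" ON "{db}" BEGIN '
            f'SELECT {agg}("{field}") INTO "{target}" FROM "{source}" '
            f'GROUP BY time({interval}) END')

# Hourly event counts rolled up out of the raw measurement:
print(create_cq_statement("cq_events_hourly", "monitordb",
                          "test1_hourly", "test1",
                          "count", "eventtype", "1h"))
```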
Recommendation:
- Adding data in time ascending order is significantly more performant.
- Limit tag cardinality for optimal memory usage
- Use SSDs for storage (https://docs.influxdata.com/influxdb/v1.7/guides/hardware_sizing/#general-hardware-guidelines-for-a-single-node)
- Use a long shard duration for better query performance
- Use short shard duration for more efficient data deletion
- Shard groups should be twice as long as the longest time range of the most frequent queries
- Each shard group should contain more than 100,000 points
- Each shard group should contain more than 1,000 points per series
- When writing historical data, temporarily set a longer shard group duration to avoid creating a large number of shards
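To see why a longer shard group duration matters for backfills, a quick calculation of how many shard groups a historical write would span:

```python
import math
from datetime import timedelta

def shard_group_count(span, shard_group_duration):
    """Rough number of shard groups created by writes spanning `span`:
    each shard group covers one shard_group_duration of time."""
    return math.ceil(span / shard_group_duration)

# Backfilling two years of history: a 1-day shard group duration
# creates roughly 7x more shard groups than a 7-day one.
shard_group_count(timedelta(days=730), timedelta(days=1))   # 730
shard_group_count(timedelta(days=730), timedelta(days=7))   # 105
```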
Limitations:
- Unable to store duplicate data points (same measurement + tag set + timestamp)
- Queries might not see the most recently written data (consistency is traded off for performance)
- Field values can only be updated by writing a new point with the same measurement + tag set + timestamp
- Unable to update tag value
- Unable to delete tag value
- Cannot delete a single point (workaround: delete based on a time range); can only drop/delete entire series
- Joins are not supported
Installation:
version: '3.3'
services:
  influxdb:
    image: 'influxdb:1.7.8-alpine'
    environment:
      INFLUXDB_DB: monitordb
      INFLUXDB_HTTP_AUTH_ENABLED: 'true'
      INFLUXDB_ADMIN_USER: admin
      INFLUXDB_ADMIN_PASSWORD: password
      INFLUXDB_USER: influxuser
      INFLUXDB_USER_PASSWORD: password
    ports:
      - "8086:8086"
      - "8088:8088"
    volumes:
      - /opt/volumes/influxdb/data:/var/lib/influxdb
    networks:
      layer0:
        aliases:
          - 'influxdb'
    labels:
      - "dev.description=transaction monitoring influx db"
networks:
  layer0:
Command line:
./usr/bin/influx -username 'admin' -password 'password'
Issues:
If a data point's timestamp falls before the retention policy's valid time range, inserting it produces the error: "partial write: points beyond retention policy dropped=1"
If a data point is inserted under a specific retention policy, queries must also reference that retention policy, e.g. "select * from ninetyday.test1"; otherwise no records will be found
Batch processing uses a separate thread pool; accumulated data is sent to the server when the flush duration (1 s) elapses or the buffer limit (10000 records) is exceeded. Call InfluxDB.close after batch processing to ensure proper resource reclamation.
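The flush-duration and buffer-limit behaviour above describes the Java client's enableBatch/close API. A hypothetical client-side analogue of the buffer limit, in Python, is simple chunking before writing:

```python
def chunked(points, batch_size=10_000):
    """Yield successive batches so that each write stays within the
    buffer limit mentioned above (10000 records per batch)."""
    for i in range(0, len(points), batch_size):
        yield points[i:i + batch_size]

# 25,000 points split into batches of 10000, 10000, and 5000:
batch_sizes = [len(b) for b in chunked(list(range(25_000)))]
```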
If the error "max-values-per-tag limit exceeded (100000/100000)" occurs: set max-values-per-tag = 0 in /etc/influxdb/influxdb.conf, or use the Docker environment variable INFLUXDB_DATA_MAX_VALUES_PER_TAG: 0
Performance:
Insertion:
- 100000 records with sequential timestamp takes about 4 seconds
- 100000 records with random timestamp takes about 4.4 seconds
Query:
- select * from ninetyday.test1 where time < now() limit 100 takes about 1.1 seconds
- select count(eventtype) from ninetyday.test1 where time < now() takes about 1.2 seconds
- select count(eventtype) from ninetyday.test1 where time < now() group by userId takes about 1.5 seconds
- select * from ninetyday.test1 where time < now() and transactionId = 'transaction1' takes about 106 ms
- select * from ninetyday.test1 where time < now() and userId = 'user1' takes about 130 ms
Insertion:
- 1000000 records with random timestamp takes about 37 seconds
Query:
- select * from ninetyday.test1 where time < now() limit 100 takes about 8.3 seconds
- select count(eventtype) from ninetyday.test1 where time < now() read timeout
- select count(eventtype) from ninetyday.test1 where time < now() group by userId read timeout
- select * from ninetyday.test1 where time < now() and transactionId = 'transaction1' takes about 70 ms
- select * from ninetyday.test1 where time < now() and userId = 'user1' takes about 130 ms