A handy tool for extracting Markdown tables from web pages

Posted by nlper


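Online Markdown table converter

  • A Markdown table converter that works quite well. It is an open-source tool I came across by chance, and I recommend it.

    • Target link: https://docs.locust.io/en/sta...
    • This is the table to extract.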

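Attempt 1: import the page directly with the HTML import feature

How to import

import → URL → paste the copied URL → click parse → scroll down and click import data → copy the output from the result pane into your Markdown file
The GIF below shows the details.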
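If you would rather script this step than click through the web page, the "parse" stage can be roughly approximated in Python. The sketch below is not the converter's own code. It is a minimal standard-library example that lists the tables found in a page so you can decide which one to import. The names `TableLister`, `list_tables`, `fetch`, and the sample HTML are all hypothetical, and it assumes the page has no nested tables.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableLister(HTMLParser):
    """Record the first-row cells and the row count of each <table>."""

    def __init__(self):
        super().__init__()
        self.tables = []        # one {"header": [...], "rows": int} per table
        self._in_cell = False   # True while inside a <td> or <th>
        self._text = []         # text pieces of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append({"header": [], "rows": 0})
        elif tag == "tr" and self.tables:
            self.tables[-1]["rows"] += 1
        elif tag in ("td", "th") and self.tables:
            self._in_cell = True
            self._text = []

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._in_cell = False
            table = self.tables[-1]
            if table["rows"] == 1:  # cells of the first row act as the header
                table["header"].append(" ".join("".join(self._text).split()))


def list_tables(html):
    """Return a summary of every table in an HTML string."""
    parser = TableLister()
    parser.feed(html)
    parser.close()
    return parser.tables


def fetch(url):
    """Download a page as text (needs network access; not called below)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical sample page, so the example runs offline.
sample = """
<html><body>
<table>
  <tr><th>Command line</th><th>Environment</th></tr>
  <tr><td>-H, --host</td><td>LOCUST_HOST</td></tr>
</table>
</body></html>
"""
print(list_tables(sample))
```

To try it on a live page, pass the result of `fetch(url)` to `list_tables`.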

The result is as follows

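Attempt 2: import from the page source

How to import

First open the target page, then right-click → Inspect → use the element picker to select the element that corresponds to the table → copy that table's HTML
import → HTML → paste the HTML copied in the previous step → import data → copy the output from the result pane into your Markdown file
The GIF below shows the details.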
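The same idea, pasting a table's HTML and getting Markdown back, can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the online tool is implemented. `TableRows` and `html_table_to_markdown` are hypothetical names, the first row is assumed to be the header, and merged cells (`rowspan`/`colspan`) are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # list of text pieces while inside <td>/<th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse whitespace and escape pipes so the cell stays on one line.
            text = " ".join("".join(self._cell).split())
            self.rows[-1].append(text.replace("|", "\\|"))
            self._cell = None


def html_table_to_markdown(fragment):
    """Convert one HTML table fragment to a Markdown pipe table."""
    parser = TableRows()
    parser.feed(fragment)
    parser.close()
    rows = [row for row in parser.rows if row]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Pad short rows so every line has the same number of columns.
    rows = [row + [""] * (width - len(row)) for row in rows]
    lines = ["| " + " | ".join(row) + " |" for row in rows]
    # The first row is treated as the header; a separator line follows it.
    lines.insert(1, "| " + " | ".join(["---"] * width) + " |")
    return "\n".join(lines)


# Hypothetical fragment, shaped like what "copy element" might give you.
fragment = """
<table>
  <tr><th>Command line</th><th>Environment</th><th>Config file</th></tr>
  <tr><td><code>-H</code>, <code>--host</code></td><td>LOCUST_HOST</td><td>host</td></tr>
</table>
"""
print(html_table_to_markdown(fragment))
```

Inline tags such as `<code>` are dropped and only their text is kept, which is usually what you want in a plain Markdown cell.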

The result is as follows

| Command line | Environment | Config file | Description |
| --- | --- | --- | --- |
| -f, --locustfile | LOCUST_LOCUSTFILE | locustfile | Python module file to import, e.g. '../other.py'. Default: locustfile |
| -H, --host | LOCUST_HOST | host | Host to load test in the following format: http://10.21.32.33 |
| -u, --users | LOCUST_USERS | users | Number of concurrent Locust users. Primarily used together with --headless. Can be changed during a test by inputs w, W (spawn 1, 10 users) and s, S (stop 1, 10 users) |
| -r, --spawn-rate | LOCUST_SPAWN_RATE | spawn-rate | The rate per second in which users are spawned. Primarily used together with --headless |
| --hatch-rate | LOCUST_HATCH_RATE | hatch-rate | ==SUPPRESS== |
| -t, --run-time | LOCUST_RUN_TIME | run-time | Stop after the specified amount of time, e.g. (300s, 20m, 3h, 1h30m, etc.). Only used together with --headless. Defaults to run forever. |
| --web-host | LOCUST_WEB_HOST | web-host | Host to bind the web interface to. Defaults to '*' (all interfaces) |
| --web-port, -P | LOCUST_WEB_PORT | web-port | Port on which to run web host |
| --headless | LOCUST_HEADLESS | headless | Disable the web interface, and instead start the load test immediately. Requires -u and -t to be specified. |
| --headful | LOCUST_HEADFUL | headful | ==SUPPRESS== |
| --web-auth | LOCUST_WEB_AUTH | web-auth | Turn on Basic Auth for the web interface. Should be supplied in the following format: username:password |
| --tls-cert | LOCUST_TLS_CERT | tls-cert | Optional path to TLS certificate to use to serve over HTTPS |
| --tls-key | LOCUST_TLS_KEY | tls-key | Optional path to TLS private key to use to serve over HTTPS |
| --master | LOCUST_MODE_MASTER | master | Set locust to run in distributed mode with this process as master |
| --master-bind-host | LOCUST_MASTER_BIND_HOST | master-bind-host | Interfaces (hostname, ip) that locust master should bind to. Only used when running with --master. Defaults to * (all available interfaces). |
| --master-bind-port | LOCUST_MASTER_BIND_PORT | master-bind-port | Port that locust master should bind to. Only used when running with --master. Defaults to 5557. |
| --expect-workers | LOCUST_EXPECT_WORKERS | expect-workers | How many workers master should expect to connect before starting the test (only when --headless used). |
| --worker | LOCUST_MODE_WORKER | worker | Set locust to run in distributed mode with this process as worker |
| --master-host | LOCUST_MASTER_NODE_HOST | master-host | Host or IP address of locust master for distributed load testing. Only used when running with --worker. Defaults to 127.0.0.1. |
| --master-port | LOCUST_MASTER_NODE_PORT | master-port | The port to connect to that is used by the locust master for distributed load testing. Only used when running with --worker. Defaults to 5557. |
| -T, --tags | LOCUST_TAGS | tags | List of tags to include in the test, so only tasks with any matching tags will be executed |
| -E, --exclude-tags | LOCUST_EXCLUDE_TAGS | exclude-tags | List of tags to exclude from the test, so only tasks with no matching tags will be executed |
| --csv | LOCUST_CSV | csv | Store current request stats to files in CSV format. Setting this option will generate three files: [CSV_PREFIX]_stats.csv, [CSV_PREFIX]_stats_history.csv and [CSV_PREFIX]_failures.csv |
| --csv-full-history | LOCUST_CSV_FULL_HISTORY | csv-full-history | Store each stats entry in CSV format to _stats_history.csv file. You must also specify the '--csv' argument to enable this. |
| --print-stats | LOCUST_PRINT_STATS | print-stats | Print stats in the console |
| --only-summary | LOCUST_ONLY_SUMMARY | only-summary | Only print the summary stats |
| --reset-stats | LOCUST_RESET_STATS | reset-stats | Reset statistics once spawning has been completed. Should be set on both master and workers when running in distributed mode |
| --html | LOCUST_HTML | html | Store HTML report file |
| --skip-log-setup | LOCUST_SKIP_LOG_SETUP | skip-log-setup | Disable Locust's logging setup. Instead, the configuration is provided by the Locust test or Python defaults. |
| --loglevel, -L | LOCUST_LOGLEVEL | loglevel | Choose between DEBUG/INFO/WARNING/ERROR/CRITICAL. Default is INFO. |
| --logfile | LOCUST_LOGFILE | logfile | Path to log file. If not set, log will go to stdout/stderr |
| --exit-code-on-error | LOCUST_EXIT_CODE_ON_ERROR | exit-code-on-error | Sets the process exit code to use when a test result contain any failure or error |
| -s, --stop-timeout | LOCUST_STOP_TIMEOUT | stop-timeout | Number of seconds to wait for a simulated user to complete any executing task before exiting. Default is to terminate immediately. This parameter only needs to be specified for the master process when running Locust distributed. |
