JavaScript: remove leading and trailing spaces from a multiline string
【Title】: Javascript remove leading and trailing spaces from multiline string 【Posted】: 2016-10-11 18:54:32 【Question】: How can I convert this text
data=`ID ra dec V VR MJD
100 30.1 +15 7.00 -10 2450000.1234
200 30.2 +16 12.226 -5.124 2450000.2345
300 30.3 +17 13.022 12.777 2450000.3456
400 30.4 +18 14.880 13.666 2450000.6789
500 30.5 +19 12.892 -1.835 2450001
600 30.6 +20 17.587 15.340 2450002.123
700 30.7 +21 13.984 13.903 2450000.123456
800 30.8 +22 20.00 10.000 2450003.0 `
that is, imported text containing multiple rows and columns separated by spaces and tabs, into this:
ID,ra,dec,V,VR,MJD
100,30.1,+15,7.00,-10,2450000.1234
200,30.2,+16,12.226,-5.124,2450000.2345
300,30.3,+17,13.022,12.777,2450000.3456
400,30.4,+18,14.880,13.666,2450000.6789
500,30.5,+19,12.892,-1.835,2450001
600,30.6,+20,17.587,15.340,2450002.123
700,30.7,+21,13.984,13.903,2450000.123456
800,30.8,+22,20.00,10.000,2450003.0
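Unfortunately,
this regex, data=data.replace(/^\s+|\s+$/g,'').replace(/[\t \r]+/g,',');
only works on the first line, and
this one, data.replace(/[^\S\r\n]+$/gm, "").replace(/[\t \r]+/g,',');
works, but only for trailing whitespace.
Extra: how can I convert it to JSON,
splitting the two blocks into two datasets, e.g. [[id:..., ra:...,,],[id:..., ra:...,,]]
【Comments】:
Are the values separated by spaces or by tabs?
Thanks for the quick comment. They can be one or more spaces or tabs.
Do they have the same meaning (as separators)? What about sparse values?
No, they are just a way to separate the data columns.
Your "extra" is a separate question.
【Answer 1】: Using split/join and trim is probably easier for the string conversion: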
data
.split(/\r?\n/)
.map(row => row.trim().split(/\s+/).join(','))
.join('\n')
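As a rough, self-contained sketch of the same split/trim/join idea, the snippet below wraps it in a function and runs it on a small made-up sample. The name toCsv and the sample string are assumptions for illustration only, not part of the original answer.

```javascript
// Hypothetical helper (name is an assumption) wrapping the split/trim/join idea.
function toCsv(text) {
  return text
    .split(/\r?\n/)                                      // one entry per line, tolerating CRLF
    .map(row => row.trim().split(/\s+/).join(','))       // trim, then collapse runs of spaces/tabs
    .join('\n');                                         // blank lines stay blank
}

// Made-up sample with leading spaces, tabs and trailing spaces.
const csvSample = 'ID ra  dec\n  100\t30.1   +15  \n200 30.2 +16';
console.log(toCsv(csvSample));
```

Because each line is trimmed before it is split, no empty first or last field is produced, and an empty line simply maps to an empty string.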
The extra credit is a bit more involved. :)
const rows = data.split(/\r?\n/).map(row => row.trim().split(/\s+/).join(','));
// The first row holds the column names
const keys = rows.shift().split(',');
// Two or more consecutive newlines separate the datasets
const chunks = rows.join("\n").split(/\n{2,}/);
const output = chunks.map(chunk => chunk.split("\n").map(
  row => row.split(',').reduce((obj, v, i) => {
    obj[keys[i]] = v;
    return obj;
  }, {})
));
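For comparison, here is a minimal sketch of the same grouping done line by line instead of by joining and re-splitting. The function name parseDatasets and the sample text are hypothetical, and it assumes an environment with Object.fromEntries (ES2019).

```javascript
// Hypothetical helper: parse whitespace-separated text into an array of datasets,
// where blank lines separate the datasets and the first line names the columns.
function parseDatasets(text) {
  const lines = text.split(/\r?\n/).map(line => line.trim());
  const keys = lines.shift().split(/\s+/);   // header row gives the property names
  const datasets = [[]];
  for (const line of lines) {
    if (line === '') {
      // A blank line closes the current dataset; a run of blank lines counts once.
      if (datasets[datasets.length - 1].length > 0) datasets.push([]);
      continue;
    }
    const values = line.split(/\s+/);
    datasets[datasets.length - 1].push(
      Object.fromEntries(keys.map((key, i) => [key, values[i]]))
    );
  }
  // Drop an empty dataset left behind by blank lines at the end of the input.
  if (datasets[datasets.length - 1].length === 0) datasets.pop();
  return datasets;
}

// Made-up sample: two rows, two blank lines, one more row.
const datasetSample = 'ID ra\n 100 30.1\n200 30.2 \n\n\n 300 30.3\n';
console.log(JSON.stringify(parseDatasets(datasetSample)));
```

This version does not depend on how many blank lines separate the blocks, which is the detail the regex split has to get right.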
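【Discussion】:
I should have created chunks by splitting on three newlines rather than two. Updated.
Hmm... that's not what I'm seeing. Here's a jsfiddle that outputs to the console.
Oh, I see! No, my solution was wrong. Sorry about that. I've updated it. Here's a new fiddle.
Yes, just split chunk on /\n{2,}/. Fiddle.
Maybe a typo? If you look at the latest fiddle, it should produce array[2].
【Answer 2】:
You're almost there. You want the multiline flag on the first replace, but you don't want to replace \n, so don't use \s. Use [ \t] instead: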
var data = 'ID ra dec V VR MJD\n' +
' 100 30.1 +15 7.00 -10 2450000.1234\n' +
'200 30.2 +16 12.226 -5.124 2450000.2345\n' +
' 300 30.3 +17 13.022 12.777 2450000.3456\n' +
'\n' +
'\n' +
'400 30.4 +18 14.880 13.666 2450000.6789\n' +
'500 30.5 +19 12.892 -1.835 2450001\n' +
' 600 30.6 +20 17.587 15.340 2450002.123\n' +
'700 30.7 +21 13.984 13.903 2450000.123456\n' +
'800 30.8 +22 20.00 10.000 2450003.0 \n'
var result = data.replace(/^[ \t]+|[ \t]+$/gm,'').replace(/[ \t]+/g,',')
console.log(result);
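A small sketch of why the character class matters: \s also matches \n, so once the m flag is on, a pattern built on \s can consume line breaks as well as spaces. The sample string here is made up for illustration.

```javascript
// Made-up sample: two rows separated by blank lines, padded with spaces and a tab.
const padded = ' a b \n\n\n  c\td  ';

// With \s, the anchored alternatives are allowed to match newline characters too.
const trimmedWithS = padded.replace(/^\s+|\s+$/gm, '');

// With [ \t], only spaces and tabs at the line edges are removed; newlines are untouched.
const trimmedWithClass = padded.replace(/^[ \t]+|[ \t]+$/gm, '');

console.log(JSON.stringify(trimmedWithS));
console.log(JSON.stringify(trimmedWithClass));
```

Printing both through JSON.stringify makes the surviving newlines visible, so the difference between the two patterns is easy to compare.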
【Discussion】:
【Answer 3】:
// First: the trimming part. Split on newlines, process
// each line by trimming it and replacing remaining white
// space with commas
var data = 'ID ra dec V VR MJD\n\
100 30.1 +15 7.00 -10 2450000.1234\n\
200 30.2 +16 12.226 -5.124 2450000.2345\n\
300 30.3 +17 13.022 12.777 2450000.3456\n\
\n\
\n\
400 30.4 +18 14.880 13.666 2450000.6789\n\
500 30.5 +19 12.892 -1.835 2450001\n\
600 30.6 +20 17.587 15.340 2450002.123\n\
700 30.7 +21 13.984 13.903 2450000.123456 \n\
800 30.8 +22 20.00 10.000 2450003.0 ';
data = data.split('\n');
var i = 0, l = data.length;
for ( ; i < l; i++)
data[i] = data[i].trim().replace(/\s+/g,',');
data = data.join('\n');
document.write('<h1>Formatted data string</h1><pre><code>'+data+'</code></pre>');
// Now to turn it into objects.
// We'll strip the first line because
// that'll be the list of column names:
var cols = data.match(/^[^\n]+/)[0].split(','),
columnCount = cols.length;
data = data.replace(/^[^\n]+\n/,'');
// Now separate the 2 datasets
var datasets = data.split('\n\n\n');
document.write('<h1>First dataset</h1><pre><code>'+datasets[0]+'</code></pre>');
document.write('<h1>Second dataset</h1><pre><code>'+datasets[1]+'</code></pre>')
// Now we go through each line and
// place the values into objects which
// we'll push to an array
var processed = [];
i = 0;
l = datasets.length;
for ( ; i < l; i++) {
    processed[i] = [];
    var lines = datasets[i].split('\n'),
        lineCount = lines.length;
    for (var j = 0; j < lineCount; j++) {
        var dataArray = lines[j].split(','),
            obj = {};
        for (var k = 0; k < columnCount; k++) {
            obj[cols[k]] = dataArray[k];
        }
        processed[i].push(obj);
    }
}
var finalJSON = JSON.stringify(processed);
document.write('<h1>Final JSON</h1><pre><code>'+finalJSON+'</code></pre>');
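【Discussion】:
This is a very complete answer. Oddly, console.log(JSON.stringify(processed)) returned the correct result, while console.log(processed) returned an array of 3 and an array of 5, each with odd key:value pairs such as undefined : "2450000.3456↵↵↵400"
【Answer 4】:
Since you know the exact format of each line, you can use capture groups on each line to extract the details. Try something like this:
/^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/mg
Remember that \s matches any whitespace character, while \S matches any non-whitespace character. You may need to adjust the capture groups to suit your data. Then, with the multiline and global flags set, we can iterate over all the matches.
Here is the code: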
// Your data, with the header removed, formatted as a string literal:
var data = "100 30.1 +15 7.00 -10 2450000.1234\n"+
"200 30.2 +16 12.226 -5.124 2450000.2345\n"+
" 300 30.3 +17 13.022 12.777 2450000.3456\n"+
"\n"+
"\n"+
"400 30.4 +18 14.880 13.666 2450000.6789\n"+
"500 30.5 +19 12.892 -1.835 2450001\n"+
" 600 30.6 +20 17.587 15.340 2450002.123\n"+
"700 30.7 +21 13.984 13.903 2450000.123456 \n"+
"800 30.8 +22 20.00 10.000 2450003.0";
// The pattern to grab the data:
var data_pattern = /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$/mg;
// Keep matching until we run out of lines that match...
var results = [];
var line_match;
while ((line_match = data_pattern.exec(data)) !== null) {
  // Parse the match into a json object and add it to the results.
  results.push({
    ID: line_match[1],
    ra: line_match[2],
    dec: line_match[3],
    V: line_match[4],
    VR: line_match[5],
    MJD: line_match[6]
  });
}
// Output the results.
console.log(JSON.stringify(results, null, 2));
Here is the result on the console:
[
  {
    "ID": "100",
    "ra": "30.1",
    "dec": "+15",
    "V": "7.00",
    "VR": "-10",
    "MJD": "2450000.1234"
  },
  {
    "ID": "200",
    "ra": "30.2",
    "dec": "+16",
    "V": "12.226",
    "VR": "-5.124",
    "MJD": "2450000.2345"
  },
  {
    "ID": "300",
    "ra": "30.3",
    "dec": "+17",
    "V": "13.022",
    "VR": "12.777",
    "MJD": "2450000.3456"
  },
  {
    "ID": "400",
    "ra": "30.4",
    "dec": "+18",
    "V": "14.880",
    "VR": "13.666",
    "MJD": "2450000.6789"
  },
  {
    "ID": "500",
    "ra": "30.5",
    "dec": "+19",
    "V": "12.892",
    "VR": "-1.835",
    "MJD": "2450001"
  },
  {
    "ID": "600",
    "ra": "30.6",
    "dec": "+20",
    "V": "17.587",
    "VR": "15.340",
    "MJD": "2450002.123"
  },
  {
    "ID": "700",
    "ra": "30.7",
    "dec": "+21",
    "V": "13.984",
    "VR": "13.903",
    "MJD": "2450000.123456"
  },
  {
    "ID": "800",
    "ra": "30.8",
    "dec": "+22",
    "V": "20.00",
    "VR": "10.000",
    "MJD": "2450003.0"
  }
]
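A possible variation on the capture-group approach, sketched under the assumption of a runtime with named capture groups (ES2018) and String.prototype.matchAll (ES2020). The names rowPattern, matchSample and namedRows are made up for illustration, and the pattern uses [ \t] rather than \s so that a single match cannot stretch across a line break.

```javascript
// One named group per column; [ \t] keeps every match on a single line.
const rowPattern = /^[ \t]*(?<ID>\S+)[ \t]+(?<ra>\S+)[ \t]+(?<dec>\S+)[ \t]+(?<V>\S+)[ \t]+(?<VR>\S+)[ \t]+(?<MJD>\S+)[ \t]*$/gm;

// Made-up sample: two data rows separated by a blank line, no header.
const matchSample = '100 30.1 +15 7.00 -10 2450000.1234\n\n  200 30.2 +16 12.226 -5.124 2450000.2345 ';

// matchAll yields one match per row; copy the named groups into a plain object.
const namedRows = Array.from(matchSample.matchAll(rowPattern), m => ({ ...m.groups }));

console.log(JSON.stringify(namedRows, null, 2));
```

Naming the groups keeps the column names next to the pattern, so the object literal with numbered indices is no longer needed.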
I hope this helps.
【Discussion】: