How can textscan in MATLAB read fixed-width fields that contain spaces?

Posted 2023-05-11


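I would like textscan to do the following:
*********************
Read the string str=' 1985 112 -10.53' and return 1985,01,12,-10.53
What I tried:
>>A=textscan(str,'%5u%2u%2u%7.2f')
A=[1985][11][2][-10.53]
The result is wrong. Judging from the output, MATLAB treats the spaces as delimiters, whereas here I want each space to be read as a 0. Changing the Delimiter and Whitespace options did not give the correct result either.

PS: The real goal is to read a data file with textscan, but the fixed-width fields in that file contain spaces. For example, the str above really means '+19850112-010.53'.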
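
Because every field occupies a fixed number of columns, one way to sidestep the whitespace handling is to slice the string by column position and convert each slice on its own. The sketch below assumes the 5/2/2/7 column layout implied by the format string in the question; the variable names are illustrative only.

```matlab
% Fixed-width parsing by column slicing (a sketch; the widths are assumed
% from the format '%5u%2u%2u%7.2f' used in the question).
str    = ' 1985 112 -10.53';
widths = [5 2 2 7];                 % year, month, day, value
stops  = cumsum(widths);            % last column of each field
starts = stops - widths + 1;        % first column of each field

vals = zeros(1, numel(widths));
for k = 1:numel(widths)
    field   = str(starts(k):stops(k));  % e.g. ' 1' for the month
    vals(k) = str2double(field);        % leading blanks are ignored
end
% vals now holds [1985 1 12 -10.53]
```

A blank inside a field is simply skipped by str2double, so ' 1' becomes 1 without replacing it by '0' first.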

Usage of textscan
Form 1: C = textscan(fid, 'format', N, 'param', value)
Form 2: C = textscan(str, 'format', N, 'param', value)
Note that these are two different cases: the first reads from a file (through fid), the second reads from a string.

First, reading from a string.
Example 1: str = '0.41 8.24 3.57 6.24 9.27';
c = textscan(str,'%3.1f');
c{1,1}
ans =
0.4000
1.0000
8.2000
4.0000
3.5000
7.0000
6.2000
4.0000
9.2000
7.0000
'%3.1f' means: read 3 characters at a time, with 1 digit after the decimal point.
C = textscan(str, '%3.1f %*1d'); result: C{1} = [0.4; 8.2; 3.5; 6.2; 9.2]
C = textscan(str, '%3.1f %*1u'); result: C{1} = [0.4; 8.2; 3.5; 6.2; 9.2]
C = textscan(str, '%3.1f'); result: C{1} = [0.4; 1.0; 8.2; 4.0; 3.5; 7.0; 6.2; 4.0; 9.2; 7.0]
C = textscan(str, '%2.1f %*1u'); result: C{1} = [0 1.0000 0.2000 3.0000 7.0000 0.2000 9.0000 7.0000]
C = textscan(str, '%2.1f %1u'); note that the result has two cells: C{1} = [0 1.0000 0.2000 3.0000 7.0000 0.2000 9.0000 7.0000]
C{2} = [4 8 4 5 6 4 2]
Example 2: reading data of different types
Create a file 'scan1.dat' with the following content:
09/12/2005 Level1 12.34 45 1.23e10 inf Nan Yes 5.1+3i
10/12/2005 Level2 23.54 60 9e19 -inf 0.001 No 2.2-.5i
11/12/2005 Level3 34.90 12 2e5 10 100 No 3.1+.1i

fid = fopen('scan1.dat');
C = textscan(fid, '%s %s %f32 %d8 %u %f %f %s %f');
fclose(fid);
Note: every conversion specifier in the format, such as '%s' or '%f32', adds one more cell to C.
C{1} = {'09/12/2005'; '10/12/2005'; '11/12/2005'} class cell
C{2} = {'Level1'; 'Level2'; 'Level3'} class cell
C{3} = [12.34; 23.54; 34.9] class single
C{4} = [45; 60; 12] class int8
C{5} = [4294967295; 4294967295; 200000] class uint32
Note: the values 9e19 and 1.23e10 in the file are far beyond the range of %u. %u is an unsigned 32-bit integer whose maximum is 4294967295, so both values are returned as that maximum.
C{6} = [Inf; -Inf; 10] class double
C{7} = [NaN; 0.001; 100] class double
C{8} = {'Yes'; 'No'; 'No'} class cell
C{9} = [5.1+3.0i; 2.2-0.5i; 3.1+0.1i] class double
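
The saturation noted for the fifth output (the %u column) matches how MATLAB integer types behave in general: a value outside the range is clamped to the nearest limit instead of wrapping around. A small sketch of that clamping using plain casts, with no textscan call involved:

```matlab
% Sketch: out-of-range values saturate at the uint32 limit.
limit = intmax('uint32');             % largest uint32 value
big   = uint32([1.23e10 9e19 2e5]);   % the three values from column 5
% The first two elements of big equal limit; the third stays 200000.
```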
Example 3: assigning a value to missing fields
Create a file data2.csv with the following content:

abc, 2, NA, 3, 4
// Comment Here
def, na, 5, 6, 7

fid = fopen('data2.csv');
C = textscan(fid, '%s %n %n %n %n', 'delimiter', ',', ...
'treatAsEmpty', {'NA', 'na'}, ...
'commentStyle', '//');
fclose(fid);
Because the format contains 5 conversion specifiers, C has 5 cells:
C{1} = {'abc'; 'def'}
C{2} = [2; NaN]
C{3} = [NaN; 5]
C{4} = [3; 6]
C{5} = [4; 7]

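Note: when textscan reads from a fid, that is, from a file, every call advances the file position, so the next textscan call starts where the previous one stopped. When textscan reads from a string, every call starts again from the first character. To continue from somewhere other than the beginning of a string, request the second output (the position) and use it to index into the string.
Example 4: lyric = 'Blackbird singing in the dead of night'
[firstword, pos] = textscan(lyric,'%9c', 1);
lastpart = textscan(lyric(pos+1:end), '%s');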
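
The difference between the two cases can be seen side by side in the following sketch. The temporary file and its contents are made up for illustration.

```matlab
% Sketch: successive textscan calls on a file continue from the current
% file position, while calls on a string always start from the beginning.
fname = [tempname '.txt'];            % arbitrary temporary file
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3 4 5 6');
fclose(fid);

fid = fopen(fname, 'r');
a = textscan(fid, '%f', 2);           % reads 1 and 2
b = textscan(fid, '%f', 2);           % continues with 3 and 4
fclose(fid);
delete(fname);

str = '1 2 3 4 5 6';
[s1, pos] = textscan(str, '%f', 2);       % reads 1 and 2, returns position
s2 = textscan(str, '%f', 2);              % starts over: 1 and 2 again
s3 = textscan(str(pos+1:end), '%f', 2);   % resumes after pos: 3 and 4
```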

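Note the difference between the following two calls:
lyric = 'Blackbird singing in the dead of night'
[firstword, pos] = textscan(lyric,'%9c', 2);
firstword{1} is "Blackbird
singing i"
[firstword, pos] = textscan(lyric,'%8c', 2);
firstword{1} is "Blackbir
d singin"
With '%9c', the first 9 characters end exactly at a space, so the next 9 characters are read starting from the "s". With '%8c', reading resumes directly at the "d", and the space after it is read as a character as well.

In either case the remainder can be read with:
lastpart = textscan(lyric(pos+1:end), '%s');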

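Note: suppose the file data.txt contains the following data:
1,1,null,2,2
1,2,2,null,2
It is read as follows:
fid = fopen('data.txt','r');
C = textscan(fid, '%n','delimiter',',','treatAsEmpty','null','HeaderLines',1);
fclose( fid ); clear fid ans
The result is:
C{1}(1:10) = [1 1 NaN 2 2 1 2 2 NaN 2]
('HeaderLines',1 skips one line, so getting all 10 values assumes the file has a header line above the two data lines shown.)
However, if '%n' in the command is changed to '%u' or '%d', for example
C = textscan(fid, '%d','delimiter',',','treatAsEmpty','null','HeaderLines',1);
then the NaN values in the result above become 0.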

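The effect of the conversion specifier on 'null' entries can be tried on an in-memory string instead of a file. This is a minimal sketch; the string simply mirrors the two data lines of data.txt, and the mask variables are illustrative names.

```matlab
% Sketch: the same 'null' handling on a string instead of data.txt.
txt = sprintf('1,1,null,2,2\n1,2,2,null,2');

Cf = textscan(txt, '%f', 'Delimiter', ',', 'TreatAsEmpty', 'null');
Cd = textscan(txt, '%d', 'Delimiter', ',', 'TreatAsEmpty', 'null');

nanMask  = isnan(Cf{1});    % true where 'null' was read with %f
zeroMask = (Cd{1} == 0);    % the same positions hold 0 with %d
```

No 'HeaderLines' option is used here because the string has no header line.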
Example:
Using a text editor, create a file grades.txt that contains
Student_ID | Test1 | Test2 | Test3
1 91.5 89.2 77.3
2 88.0 67.8 91.0
3 76.3 78.1 92.5
4 96.4 81.2 84.6

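fid = fopen('grades.txt');
C_text = textscan(fid, '%s', 4, 'delimiter', '|');

C_data1 = textscan(fid, '%d %f %f %f', ...
'CollectOutput', 1)
fclose(fid);
Note: with 'CollectOutput', consecutive columns of the same class are combined into one array. Here the %d column is kept on its own, while the three %f columns are collected together.

Reference answer A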

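I'm still not quite sure what you mean. My rough understanding is something like this:

%% Scan string, example 2
clear
clc
str = '1985 112 -10.53';
% Replace spaces (ASCII 32) with '0' (ASCII 48)
A = find(str == 32);
str(A) = 48;
% The line below is now equivalent to +198501120-10.53,
% not the +19850112-010.53 that you gave.
% The second space sits in front of the minus sign; why is your first position right while the second one moved back by one?
str2num(str)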

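Here is another example of scanning a string with textscan; see whether it helps:

%% Use textscan to scan the data in a string
clc
str_1 = 'The number is 1 2 3 4 5';
% First use textscan to get the first 14 characters
[str1,position1] = textscan(str_1,'%14c',1);
str1{:}; % The number is
position1; % 14
% Get the length of the string
[temp1,temp2] = size(str_1);
% Then read the numeric string that follows
str_2 = textscan(str_1(position1+1:temp2),'%9c',1);
% Convert the string to numeric values
num = str2num(str_2{1})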

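Follow-up

Thanks for the reply. I had considered the first approach too, but as you pointed out, it cannot automatically produce -010.53 (which is exactly the format of the data I need to read). I want to use textscan because of its speed, so for now I have not considered doing further rounds of replacement. Inspired by your answer, though, I plan to try reading everything in as strings first and then replacing, and to test how fast that is.
Thanks again~

This answer was accepted by the asker.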
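
The plan mentioned in the follow-up, reading every line as text first and converting afterwards, could look roughly like the sketch below. The sample text and the 5/2/2/7 column layout are assumptions based on the question, and the code slices columns rather than replacing blanks; for a real file, a fid would take the place of txt.

```matlab
% Sketch: read whole lines as text, then slice the fixed-width columns.
% The second sample line is made up; both lines are 16 characters wide.
txt = sprintf(' 1985 112 -10.53\n 1986 3 7   2.10');

% 'Whitespace','' keeps leading blanks so column positions stay intact.
C    = textscan(txt, '%s', 'Delimiter', '\n', 'Whitespace', '');
rows = char(C{1});                        % one row of characters per line

year  = str2double(cellstr(rows(:, 1:5)));
month = str2double(cellstr(rows(:, 6:7)));
day   = str2double(cellstr(rows(:, 8:9)));
value = str2double(cellstr(rows(:, 10:16)));
data  = [year month day value];           % one numeric row per input line
```

Whether this is fast enough for large files would still need to be measured, as the asker intends.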

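How do I save variables in MATLAB?

Reference answer A

Use save to save variables.

save data1 saves all variables in the workspace to data1.mat.

save data2 m saves the workspace variable m to data2.mat; an error is raised if m does not exist in the workspace.

save data3 m n p v saves the four workspace variables m, n, p and v to data3.mat; an error is also raised if any one of them does not exist in the workspace.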
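
The same commands can be written in function form, which is convenient when the file name is stored in a variable. A minimal sketch, with made-up variable values and a temporary file name:

```matlab
% Sketch: function form of save, writing to a temporary MAT-file.
m = magic(3); n = 42; p = 'text'; v = [1 2 3];
outFile = [tempname '.mat'];

save(outFile, 'm', 'n', 'p', 'v');    % equivalent to: save <file> m n p v

info = whos('-file', outFile);        % list what the MAT-file contains
savedNames = sort({info.name});

S = load(outFile, 'm');               % load one variable back into a struct
delete(outFile);
```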

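Additional information:

Variable naming rules

In MATLAB, a variable name must begin with a letter.

A variable name may contain only letters, digits and underscores.

A variable name cannot contain spaces. If it consists of several words, separate them with underscores (e.g. my_string) or capitalize the inner words (e.g. myString).

A variable is a convenient placeholder that refers to a location in computer memory, which can store program information that may change while a script runs.

Reference: Baidu Baike, "Variable (computing term)"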
