Where are static variables stored in C and C++?

Posted 2023-02-19


【Title】 Where are static variables stored in C and C++? 【Asked】 2010-09-10 17:15:07 【Question】:

In which segment of an executable file (.BSS, .DATA, or another) are static variables stored so that their names do not collide? For example:


foo.c:                         bar.c:
static int foo = 1;            static int foo = 10;
void fooTest() {               void barTest() {
  static int bar = 2;            static int bar = 20;
  foo++;                         foo++;
  bar++;                         bar++;
  printf("%d,%d", foo, bar);     printf("%d, %d", foo, bar);
}                              }

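If I compile both files and link them with a main that calls fooTest() and barTest() repeatedly, the values printed by the two printf statements increase independently. That makes sense, since the foo and bar variables are local to their translation unit.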
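A minimal single-file sketch of the function-local half of this behaviour (the file-scope half needs two source files, as in the listing above). The names foo_test and bar_test are illustrative stand-ins rather than the question's exact code, and the functions return the counter instead of printing it:

```c
/* Each function owns a separate static object that happens to be named
 * `bar`. A block-scope name has no linkage, so the two never interfere. */
int foo_test(void)
{
    static int bar = 2;    /* set up once; keeps its value between calls */
    bar++;
    return bar;
}

int bar_test(void)
{
    static int bar = 20;   /* a different object with the same name */
    bar++;
    return bar;
}
```

Calling the two functions alternately from a main of your own should show each counter advancing on its own, which matches what the question observes.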

But where is the storage allocated?

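To be clear, assume a toolchain that outputs files in ELF format. I therefore believe that some space has to be reserved in the executable file for these static variables. For the sake of discussion, assume we are using the GCC toolchain.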

【Comments on the question】:

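Most people are telling you that they should be stored in the .DATA section instead of answering your question: where exactly in the .DATA section, and how can you find out where. I see you have already marked an answer, so do you already know how to find it?

Why initialised and uninitialised data are placed in different sections: linuxjournal.com/article/1059

The storage allocated to your global/static variables at run time has nothing to do with their name resolution, which happens at build/link time. Once the executable has been built, there are no more names.

This question is meaningless, being built on the false premise that a "name collision" between unexported symbols is a thing that can exist. The fact that there is no legitimate question might explain how dire some of the answers are. It is hard to believe so few people got this.

【Answer 1】: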

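Where your statics go depends on whether they are zero-initialized. Zero-initialized static data goes in .BSS (Block Started by Symbol); non-zero-initialized data goes in .DATA.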
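As a rough illustration of the rule above, the sketch below declares one static of each kind. The variable and function names are made up for this example, and the section named in each comment is what GCC on ELF commonly does, not something the C standard promises (the comments below report implementations that behave differently).

```c
#include <stddef.h>

static int no_initializer;      /* zero by the language rules; commonly .bss */
static int explicit_zero = 0;   /* also zero; often .bss, sometimes .data */
static int nonzero = 42;        /* value must be stored in the file; commonly .data */
static int table[256];          /* zero-filled array; commonly .bss */

int read_no_initializer(void) { return no_initializer; }
int read_explicit_zero(void)  { return explicit_zero; }
int read_nonzero(void)        { return nonzero; }

/* Returns 1 when every element of `table` is zero, 0 otherwise. */
int table_is_all_zero(void)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Whatever section the toolchain picks, the observable values are the same: statics without an initializer read as zero.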

【Comments】:

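By "non-0 initialized" you probably mean "initialized, but with something other than 0", because there is no such thing as "non-initialized" static data in C/C++. Everything static is zero-initialized by default.

@Don Neufeld: your answer does not answer the question at all. I do not understand why it was accepted, because both 'foo' and 'bar' are non-0 initialized. The question is where two static/global variables with the same name are placed in .bss or .data.

I have used implementations where static data that was explicitly zero-initialized went in .data, and static data with no initializer went into .bss.

@M.M In my case, whether the static member is uninitialized (implicitly initialized to 0) or explicitly initialized to 0, it ends up in the .bss section in both cases.

Is this information specific to a certain executable file type? Since you did not specify, I assume it applies at least to ELF and Windows PE executables, but what about other types?

【Answer 2】: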

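When a program is loaded into memory, it is organized into different segments. One of these segments is the DATA segment. The data segment is further subdivided into two parts: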

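Initialized data segment: all the global, static and constant data are stored here.
Uninitialized data segment (BSS): all the uninitialized data are stored in this segment.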

Here is a diagram that explains this concept:

Here is a very good link explaining these concepts: Memory Management in C: The Heap and the Stack
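A small sketch of the read-only versus read-write split that the comments below bring up. The identifiers are invented for illustration, and the section names in the comments describe common GCC/ELF behaviour, which is an assumption about the toolchain rather than a rule of the language.

```c
#include <stddef.h>

static const int primes[] = {2, 3, 5, 7};  /* never written; often .rodata on ELF */
static int hits = 1;                        /* written at run time; often .data */

/* String literals also have static storage and must not be modified. */
const char *greeting(void)
{
    return "hello";
}

int sum_primes(void)
{
    int sum = 0;
    for (size_t i = 0; i < sizeof primes / sizeof primes[0]; i++) {
        sum += primes[i];
    }
    return sum;
}

/* Modifies the writable static and returns its new value. */
int record_hit(void)
{
    hits++;
    return hits;
}
```

Keeping constant data separate from writable data lets a loader map it read-only, which is also why a microcontroller can leave it in flash.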

【Comments】:

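The answer above says that 0-initialized data goes into BSS. Does "0-initialized" mean uninitialized, or 0 per se? If it means 0 per se, then I think you should include that in your answer.

Constant data is not stored in the .data segment but in the .const segment of the text section.

Also note that, as far as I understand, "initialized data" can consist of initialized variables and initialized constants. On a microcontroller (e.g. STM32), initialized variables are stored in Flash memory by default and copied to RAM at startup, while initialized constants are left in, and intended to be read from, Flash only, along with the text, which contains the program itself.

The link is broken :(

+1 for @GabrielStaples for highlighting that initialized data can be further classified into read-only (=> .rodata section) and read-write (=> .data section).

【Answer 3】: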

In fact, a variable is a tuple (storage, scope, type, address, value):

storage     :   where is it stored, for example data, stack, heap...
scope       :   who can see us, for example global, local...
type        :   what is our type, for example int, int*...
address     :   where are we located
value       :   what is our value

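Local scope can mean local to the translation unit (source file), to the function, or to the block, depending on where the variable is defined. To keep its value for more than one function call or to be visible to more than one function, a variable has to live in either the DATA or the BSS area (depending on whether it is explicitly initialized or not, respectively). Its scope is then limited accordingly, either to all functions in the source file or to a single function.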
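To make the difference between scope and storage concrete, here is a hedged sketch with hypothetical function names. The name slot can only be used inside counter_slot, yet the object it names lasts for the whole run, so its address can safely be handed out.

```c
/* `slot` has block scope (nameable only in here) but static storage
 * duration, so returning its address is valid. Returning the address of
 * an automatic variable would not be. */
int *counter_slot(void)
{
    static int slot = 0;
    return &slot;
}

/* Reaches the same object through the pointer, without knowing its name. */
int bump(void)
{
    int *p = counter_slot();
    *p += 1;
    return *p;
}
```

Every call to counter_slot returns the same address, which is the "address" element of the tuple staying fixed while the "value" element changes.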

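【Comments】:

+1 for the thorough high-level categorization. It would be great if you could also point to the source(s) of this information.

【Answer 4】: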

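Where the data is stored is implementation dependent.

However, static here means "internal linkage". The symbol is therefore internal to its compilation unit (foo.c, bar.c) and cannot be referenced from outside that compilation unit. So there can be no name collisions.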

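【Comments】:

No. The static keyword has overloaded meanings: in this case static is a storage modifier, not a linkage modifier.

ugasoft: the statics outside the function are linkage modifiers; the ones inside are storage modifiers, where there can be no collision to start with.

【Answer 5】: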

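In the "global and static" area :)

There are several memory areas in C++:

heap
free store
stack
global & static
const

See here for a detailed answer to your question:

The following summarizes a C++ program's major distinct memory areas. Note that some of the names (e.g., "heap") do not appear as such in the draft [standard].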

     Memory Area     Characteristics and Object Lifetimes
     --------------  ------------------------------------------------

     Const Data      The const data area stores string literals and
                     other data whose values are known at compile
                     time.  No objects of class type can exist in
                     this area.  All data in this area is available
                     during the entire lifetime of the program.

                     Further, all of this data is read-only, and the
                     results of trying to modify it are undefined.
                     This is in part because even the underlying
                     storage format is subject to arbitrary
                     optimization by the implementation.  For
                     example, a particular compiler may store string
                     literals in overlapping objects if it wants to.


     Stack           The stack stores automatic variables. Typically
                     allocation is much faster than for dynamic
                     storage (heap or free store) because a memory
                     allocation involves only pointer increment
                     rather than more complex management.  Objects
                     are constructed immediately after memory is
                     allocated and destroyed immediately before
                     memory is deallocated, so there is no
                     opportunity for programmers to directly
                     manipulate allocated but uninitialized stack
                     space (barring willful tampering using explicit
                     dtors and placement new).


     Free Store      The free store is one of the two dynamic memory
                     areas, allocated/freed by new/delete.  Object
                     lifetime can be less than the time the storage
                     is allocated; that is, free store objects can
                     have memory allocated without being immediately
                     initialized, and can be destroyed without the
                     memory being immediately deallocated.  During
                     the period when the storage is allocated but
                     outside the object's lifetime, the storage may
                     be accessed and manipulated through a void* but
                     none of the proto-object's nonstatic members or
                     member functions may be accessed, have their
                     addresses taken, or be otherwise manipulated.


     Heap            The heap is the other dynamic memory area,
                     allocated/freed by malloc/free and their
                     variants.  Note that while the default global
                     new and delete might be implemented in terms of
                     malloc and free by a particular compiler, the
                     heap is not the same as free store and memory
                     allocated in one area cannot be safely
                     deallocated in the other. Memory allocated from
                     the heap can be used for objects of class type
                     by placement-new construction and explicit
                     destruction.  If so used, the notes about free
                     store object lifetime apply similarly here.


     Global/Static   Global or static variables and objects have
                     their storage allocated at program startup, but
                     may not be initialized until after the program
                     has begun executing.  For instance, a static
                     variable in a function is initialized only the
                     first time program execution passes through its
                     definition.  The order of initialization of
                     global variables across translation units is not
                     defined, and special care is needed to manage
                     dependencies between global objects (including
                     class statics).  As always, uninitialized proto-
                     objects' storage may be accessed and manipulated
                     through a void* but no nonstatic members or
                     member functions may be used or referenced
                     outside the object's actual lifetime.
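The lifetimes in the table can be compared with a short C sketch (the table describes C++, so this is only an approximation, and the function names are invented here). One difference worth noting: in C the initializer of a static local must be a constant expression and is in place before the program starts, rather than being run the first time execution reaches the definition.

```c
#include <stdlib.h>

/* Stack: `n` is created afresh on every call. */
int automatic_counter(void)
{
    int n = 0;
    n++;
    return n;            /* always 1 */
}

/* Global/static: `n` exists for the whole run and keeps its value. */
int static_counter(void)
{
    static int n = 0;
    n++;
    return n;            /* 1, 2, 3, ... */
}

/* Heap: the object lives from malloc until the caller passes it to free.
 * Returns NULL if the allocation fails. */
int *heap_counter(void)
{
    int *n = malloc(sizeof *n);
    if (n != NULL) {
        *n = 0;
    }
    return n;
}
```

The caller of heap_counter owns the result and is responsible for calling free on it.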

【Comments】:

【Answer 6】:

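How to find it yourself with objdump -Sr

To really understand what is going on, you need to understand linker relocation. If you have never touched it, consider reading this post first.

Let's analyze a Linux x86-64 ELF example to see it for ourselves: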

#include <stdio.h>

int f() {
    static int i = 1;
    i++;
    return i;
}

int main() {
    printf("%d\n", f());
    printf("%d\n", f());
    return 0;
}

Compile with:

gcc -ggdb -c main.c

Disassemble the code with:

objdump -Sr main.o

-S disassembles the code with the original source intermingled
-r shows relocation information

Inside the disassembly of f we see:

 static int i = 1;
 i++;
4:  8b 05 00 00 00 00       mov    0x0(%rip),%eax        # a <f+0xa>
        6: R_X86_64_PC32    .data-0x4

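and the .data-0x4 says that the reference will resolve to the first byte of the .data segment.

The -0x4 is there because we are using RIP-relative addressing, hence the %rip in the instruction and the R_X86_64_PC32 relocation type.

It is required because RIP points to the following instruction, which starts 4 bytes after the 00 00 00 00 that will get relocated. I have explained this in more detail at: https://***.com/a/30515926/895245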
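The arithmetic behind the -0x4 can be checked with a small sketch. It assumes the standard x86-64 psABI formula for R_X86_64_PC32, S + A - P, and the helper names are made up for this example. The sample addresses used in the note after the code come from the gdb dump in a later answer: the mov at 0x400531 has two opcode bytes, so its 4-byte field sits at 0x400533, and foo is at 0x601040.

```c
#include <stdint.h>

/* Value the linker writes into the 4-byte field for R_X86_64_PC32.
 *   s = address of the symbol
 *   a = addend (-4 in the listing above)
 *   p = address of the field being patched */
int32_t pc32_field(uint64_t s, int64_t a, uint64_t p)
{
    return (int32_t)(s + (uint64_t)a - p);
}

/* Address the CPU computes at run time: RIP already points at the next
 * instruction, which begins 4 bytes after the start of the field. */
uint64_t rip_relative_target(uint64_t p, int32_t field)
{
    uint64_t next_instruction = p + 4;
    return next_instruction + (uint64_t)(int64_t)field;
}
```

With s = 0x601040, a = -4 and p = 0x400533 the field comes out as 0x200b09, and adding it to the next instruction's address 0x400537 gives back 0x601040. The -4 in the addend cancels the +4 that the CPU adds implicitly.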

Then, if we change the source to i = 0 and repeat the same analysis, we conclude that:

static int i = 0 goes in .bss
static int i = 1 goes in .data

【Comments】:

【Answer 7】:

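I don't believe there will be a collision. Using static at file level (outside functions) marks the variable as local to the current compilation unit (file). It is never visible outside the current file, so it never needs a name that can be used externally.

Using static inside a function is different: the variable is visible only to the function (whether static or not), and static just means its value is preserved across calls to that function.

In effect, static does two different things depending on where it appears. In both cases, however, the variable's visibility is limited in such a way that namespace clashes at link time are easily avoided.

Having said that, I believe it would be stored in the DATA section, which tends to hold variables initialized to values other than zero. This is, of course, an implementation detail and not something mandated by the standard, which only cares about behaviour, not how things are done under the covers.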
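The two roles of static described in this answer can be put side by side in one file. This is only a sketch with invented accessor names: it shows a file-scope foo and a function-local foo coexisting as separate objects, but it cannot show the cross-file case, which needs two translation units.

```c
static int foo = 1;        /* file scope: internal linkage, hidden from other files */

int bump_file_foo(void)
{
    foo++;                 /* refers to the file-scope object */
    return foo;
}

int *local_foo(void)
{
    static int foo = 10;   /* block scope: hides the file-scope foo in here */
    foo++;
    return &foo;
}

int *file_foo(void)
{
    return &foo;           /* the file-scope object again */
}
```

The two pointers returned by local_foo and file_foo differ, so the compiler has given each foo its own slot even though the source-level names are identical.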

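【Comments】:

@paxdiablo: you have mentioned two types of static variables. Which one of them does this article (en.wikipedia.org/wiki/Data_segment) refer to? The data segment also holds global variables (which are exactly the opposite of static ones in nature).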

So, how does a segment of memory (Data Segment) store variables that can be accessed from everywhere (global variables) and also those which have limited scope (file scope or function scope in case of static variables)?

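@eSKay, it has to do with visibility. A segment can hold some things that are local to a compilation unit and others that are fully accessible. One example: think of each comp-unit as contributing a block to the DATA segment. It knows where everything is within that block. It also publishes the addresses of those things in the block that it wants other comp-units to be able to access. The linker can resolve those addresses at link time.

【Answer 8】: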

This is how it is done (easy to understand):

【Comments】:

【Answer 9】:

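It depends on the platform and compiler you are using. Some compilers store them directly in the code segment. Static variables are only ever accessible to the current translation unit and their names are not exported, which is why name collisions never occur.

【Comments】:

【Answer 10】: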

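Data declared in a compilation unit will go into the .BSS or the .DATA of that file's output. Initialised data in BSS, uninitialised in DATA.

The difference between static and global data lies in the symbol information included in the file. Compilers tend to include the symbol information for both, but mark only the global symbols as such.

The linker respects this information. The symbol information for static variables is either discarded or mangled so that static variables can still be referenced in some way (with debug or symbol options). In neither case can other compilation units be affected, because the linker resolves local references first.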

【Comments】:

-1 for the inaccurate statement: uninitialized data does NOT go into DATA. Uninitialized and zero-initialized data go into the BSS section.

【Answer 11】:

I tried it with objdump and gdb; here is the result I got:

(gdb) disas fooTest
Dump of assembler code for function fooTest:
   0x000000000040052d <+0>: push   %rbp
   0x000000000040052e <+1>: mov    %rsp,%rbp
   0x0000000000400531 <+4>: mov    0x200b09(%rip),%eax        # 0x601040 <foo>
   0x0000000000400537 <+10>:    add    $0x1,%eax
   0x000000000040053a <+13>:    mov    %eax,0x200b00(%rip)        # 0x601040 <foo>
   0x0000000000400540 <+19>:    mov    0x200afe(%rip),%eax        # 0x601044 <bar.2180>
   0x0000000000400546 <+25>:    add    $0x1,%eax
   0x0000000000400549 <+28>:    mov    %eax,0x200af5(%rip)        # 0x601044 <bar.2180>
   0x000000000040054f <+34>:    mov    0x200aef(%rip),%edx        # 0x601044 <bar.2180>
   0x0000000000400555 <+40>:    mov    0x200ae5(%rip),%eax        # 0x601040 <foo>
   0x000000000040055b <+46>:    mov    %eax,%esi
   0x000000000040055d <+48>:    mov    $0x400654,%edi
   0x0000000000400562 <+53>:    mov    $0x0,%eax
   0x0000000000400567 <+58>:    callq  0x400410 <printf@plt>
   0x000000000040056c <+63>:    pop    %rbp
   0x000000000040056d <+64>:    retq   
End of assembler dump.

(gdb) disas barTest
Dump of assembler code for function barTest:
   0x000000000040056e <+0>: push   %rbp
   0x000000000040056f <+1>: mov    %rsp,%rbp
   0x0000000000400572 <+4>: mov    0x200ad0(%rip),%eax        # 0x601048 <foo>
   0x0000000000400578 <+10>:    add    $0x1,%eax
   0x000000000040057b <+13>:    mov    %eax,0x200ac7(%rip)        # 0x601048 <foo>
   0x0000000000400581 <+19>:    mov    0x200ac5(%rip),%eax        # 0x60104c <bar.2180>
   0x0000000000400587 <+25>:    add    $0x1,%eax
   0x000000000040058a <+28>:    mov    %eax,0x200abc(%rip)        # 0x60104c <bar.2180>
   0x0000000000400590 <+34>:    mov    0x200ab6(%rip),%edx        # 0x60104c <bar.2180>
   0x0000000000400596 <+40>:    mov    0x200aac(%rip),%eax        # 0x601048 <foo>
   0x000000000040059c <+46>:    mov    %eax,%esi
   0x000000000040059e <+48>:    mov    $0x40065c,%edi
   0x00000000004005a3 <+53>:    mov    $0x0,%eax
   0x00000000004005a8 <+58>:    callq  0x400410 <printf@plt>
   0x00000000004005ad <+63>:    pop    %rbp
   0x00000000004005ae <+64>:    retq   
End of assembler dump.

Here is the objdump result:

Disassembly of section .data:

0000000000601030 <__data_start>:
    ...

0000000000601038 <__dso_handle>:
    ...

0000000000601040 <foo>:
  601040:   01 00                   add    %eax,(%rax)
    ...

0000000000601044 <bar.2180>:
  601044:   02 00                   add    (%rax),%al
    ...

0000000000601048 <foo>:
  601048:   0a 00                   or     (%rax),%al
    ...

000000000060104c <bar.2180>:
  60104c:   14 00                   adc    $0x0,%al

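In other words, all four of your variables are located in the data section; they share names but sit at different offsets.

【Comments】:

There is much, much more to it than that. Even the existing answers are not complete. Just to mention one other thing: thread locals.

【Answer 12】: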

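As mentioned before, static variables are stored in the data segment or the code segment. You can be sure that they will not be allocated on the stack or the heap. There is no risk of collision, since the static keyword limits the variable's scope to a file or a function; if a collision did occur, the compiler/linker would warn you about it. A nice example

【Comments】:

【Answer 13】: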

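The answer may very well depend on the compiler, so you probably want to edit your question (I mean, even the notion of segments is not mandated by ISO C or ISO C++). For instance, on Windows an executable does not carry symbol names. One 'foo' would be at offset 0x100, the other perhaps at 0x2B0, and the code from both translation units is compiled knowing the offset of "its" foo.

【Comments】:

【Answer 14】: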

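This question is a bit too old, but since nobody has pointed out any useful information: check the post by 'mohit12379', which explains how static variables with the same name are stored in the symbol table: http://www.geekinterview.com/question_details/24745

【Comments】:

【Answer 15】: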

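They are both going to be stored independently. However, if you want to make that clear to other developers, you might want to wrap them in namespaces.

【Comments】:

【Answer 16】: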

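You already know that it is stored either in bss (block started by symbol), also referred to as the uninitialized data segment, or in the initialized data segment.

Let's take a simple example: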

int main(void) {
    static int i;       /* no initializer */
    return 0;
}

The static variable above has no initializer, so it goes to the uninitialized data segment (bss).

int main(void) {
    static int i = 10;  /* explicit non-zero initializer */
    return 0;
}

Here it is initialized to 10, so it goes to the initialized data segment.

【Comments】:
