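问渠网 - Deep into C - Section 1: How Function Calls Work

Author: 李德强

Section 1: How Function Calls Work

This chapter requires some knowledge of assembly language.

1. Function Call and Return

In Chapter 1 we learned how basic variables are stored in memory. In this section we look at how functions are stored in memory. Let's start with a very simple program that does nothing and contains only two functions, main() and normal_function():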

void normal_function(void)
{

}

int main(int argc, char **args)
{
        normal_function();
        return 0;
}


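No arguments or return values pass between these two functions. In fact, we will ignore main() for now and focus only on normal_function(). Its return type is void and its parameter list is void, that is, it takes no arguments and returns no value. Now let's look at the disassembly of this program. First compile main.c with gcc to produce the executable main, then disassemble it with objdump: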

gcc -m32 main.c -o main
objdump -S -D -m i386 -M att main > main.S


The disassembly output is redirected into the file main.S. The file is long, but we only care about the parts that belong to main and normal_function:

080483eb <normal_function>:
 80483eb:        55                           push   %ebp
 80483ec:        89 e5                        mov    %esp,%ebp
 80483ee:        90                           nop
 80483ef:        5d                           pop    %ebp
 80483f0:        c3                           ret    

080483f1 <main>:
 80483f1:        55                           push   %ebp
 80483f2:        89 e5                        mov    %esp,%ebp
 80483f4:        e8 f2 ff ff ff               call   80483eb <normal_function>
 80483f9:        b8 00 00 00 00               mov    $0x0,%eax
 80483fe:        5d                           pop    %ebp
 80483ff:        c3                           ret


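Among the disassembled instructions of main, for now we only care about the third one, call 80483eb <normal_function>. A call instruction performs the following three steps:

(1) By the time call executes, %eip already holds the address of the next instruction.
(2) %eip is pushed onto the stack as the return address.
(3) %eip is set to the target address, which is the address of the next instruction plus the signed 4-byte displacement that follows the call opcode.

Let's walk through the instruction with which main calls normal_function(). Step 1: the instruction after call is mov $0x0,%eax, located at address 0x80483f9, so %eip is 0x80483f9. Step 2: %eip is pushed onto the stack. Step 3: the four bytes after the e8 opcode are f2 ff ff ff, the little-endian encoding of 0xfffffff2, that is, -14. Adding -14 to 0x80483f9 gives 0x80483eb, which is the target that objdump prints, so %eip is set to 0x80483eb. When the CPU executes the next instruction, %eip is 0x80483eb, so execution continues at 0x80483eb, which is exactly the address of normal_function().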
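
The address arithmetic in step 3 can be checked with a short C sketch. The helper name call_target is hypothetical and introduced only for illustration; it assumes the 5-byte near-call encoding seen in the listing (opcode e8 followed by a little-endian 32-bit displacement).

```c
#include <stdint.h>

/* Hypothetical helper: compute the destination of a 5-byte near call
 * (opcode 0xe8 followed by a little-endian rel32 displacement).
 * call_addr is the address of the call instruction itself. */
uint32_t call_target(uint32_t call_addr, const uint8_t insn[5])
{
    /* Assemble the displacement from its little-endian bytes. */
    uint32_t rel = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);

    /* The return address is the instruction right after the call. */
    uint32_t next = call_addr + 5;

    /* Unsigned wrap-around gives the same result as adding the
     * signed displacement (0xfffffff2 acts as -14). */
    return next + rel;
}
```

Called with the address 0x80483f4 and the bytes e8 f2 ff ff ff from the listing, the helper returns 0x80483eb, the address of normal_function.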

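normal_function() consists of only five instructions. The first three (nop does nothing), together with pop %ebp, maintain the stack frame that C uses for function parameters and local variables. We will discuss them in detail in the next section. Here we only care about the last one: ret. When ret executes, it pops the value at the top of the stack into %eip, and execution continues from there.

When main called normal_function(), the return address 0x80483f9 was pushed onto the stack, and by the time ret runs it is at the top of the stack again. ret pops it into %eip, so %eip becomes 0x80483f9, the address of mov $0x0,%eax, which is the instruction after call in main. The program therefore continues with the rest of main.

As we can see, C implements function call and return with the call and ret assembly instructions.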
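
The interplay of call and ret can be sketched as a toy model in C. This is only an illustration under simplifying assumptions: the struct toy_cpu and the functions toy_init, toy_call and toy_ret are hypothetical names, the stack is a small fixed array, and there is no overflow checking.

```c
#include <stdint.h>

/* Toy model of the register and stack involved in call/ret.
 * All names here are hypothetical and exist only for illustration. */
struct toy_cpu {
    uint32_t eip;        /* address of the next instruction to execute */
    int      sp;         /* index of the top of stack; 16 means empty */
    uint32_t stack[16];  /* grows toward lower indices, like the x86 stack */
};

/* Start with an empty stack and eip at the given address. */
void toy_init(struct toy_cpu *cpu, uint32_t eip)
{
    cpu->eip = eip;
    cpu->sp = 16;
}

/* call: eip already points past the call instruction, so push it as
 * the return address and then jump to the target. */
void toy_call(struct toy_cpu *cpu, uint32_t target)
{
    cpu->stack[--cpu->sp] = cpu->eip;
    cpu->eip = target;
}

/* ret: pop the return address from the top of the stack into eip. */
void toy_ret(struct toy_cpu *cpu)
{
    cpu->eip = cpu->stack[cpu->sp++];
}
```

Initializing eip to 0x80483f9 and calling toy_call with 0x80483eb leaves 0x80483f9 on the stack and eip at 0x80483eb. A following toy_ret restores eip to 0x80483f9, mirroring the walk-through above.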

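2. Function Return Values

First, let us settle one point: a C function can have only one return value. Its type may be void, which means that the function returns nothing.

Having established that a C function has at most one return value, let's see what C uses to carry it. We modify the example above so that normal_function() returns an int: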

int normal_function(void)
{
        return 0x11223344;
}

int main(int argc, char **args)
{
        int val = normal_function();

        return 0;
}


Now look at its disassembly:

080483eb <normal_function>:
 80483eb:        55                           push   %ebp
 80483ec:        89 e5                        mov    %esp,%ebp
 80483ee:        b8 44 33 22 11               mov    $0x11223344,%eax
 80483f3:        5d                           pop    %ebp
 80483f4:        c3                           ret    

080483f5 <main>:
 80483f5:        55                           push   %ebp
 80483f6:        89 e5                        mov    %esp,%ebp
 80483f8:        83 ec 10                     sub    $0x10,%esp
 80483fb:        e8 eb ff ff ff               call   80483eb <normal_function>
 8048400:        89 45 fc                     mov    %eax,-0x4(%ebp)
 8048403:        b8 00 00 00 00               mov    $0x0,%eax
 8048408:        c9                           leave  
 8048409:        c3                           ret


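Note the third instruction in normal_function, mov $0x11223344,%eax, and the instruction after call in main, mov %eax,-0x4(%ebp). These two instructions are clearly related, because both use the %eax register. normal_function() places the return value 0x11223344 in %eax. After the call completes, main reads that value straight from %eax and stores it in one of its local variables (we ignore the memory addresses of local variables for now; they are covered in the next section). main then sets %eax to 0, which is its own return value.

In other words, in this 32-bit x86 build a C function's return value is passed in the %eax register. The called function places the return value in %eax before it returns, and the calling function reads the return value from %eax.

What about return values of other types, such as char, short, long, float and double? Here is the source and disassembly for a function that returns a char: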

char normal_function(void)
{
        return 0x11;
}

int main(int argc, char **args)
{
        char val = normal_function();

        return 0;
}

080483eb <normal_function>:
 80483eb:        55                           push   %ebp
 80483ec:        89 e5                        mov    %esp,%ebp
 80483ee:        b8 11 00 00 00               mov    $0x11,%eax
 80483f3:        5d                           pop    %ebp
 80483f4:        c3                           ret    

080483f5 <main>:
 80483f5:        55                           push   %ebp
 80483f6:        89 e5                        mov    %esp,%ebp
 80483f8:        83 ec 10                     sub    $0x10,%esp
 80483fb:        e8 eb ff ff ff               call   80483eb <normal_function>
 8048400:        88 45 ff                     mov    %al,-0x1(%ebp)
 8048403:        b8 00 00 00 00               mov    $0x0,%eax
 8048408:        c9                           leave  
 8048409:        c3                           ret

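Although the function is declared to return a char, the disassembly shows that the callee still writes the value through %eax, while main reads it through %al. A char occupies only 1 byte, so %al, the low 8 bits of %eax, is enough. When the return type is short, the value is still returned in %eax, and main reads it through %ax, because a short occupies 2 bytes. In this 32-bit build int and long both occupy 4 bytes, so the value is both returned and read through %eax. A long long is wider than 4 bytes, so it is returned in the register pair %edx and %eax: %edx holds the high half and %eax holds the low half: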

long long normal_function(void)
{
        return 0x1122334455667788;
}

int main(int argc, char **args)
{
        long long  val = normal_function();

        return 0;
}



080483eb <normal_function>:
 80483eb:        55                           push   %ebp
 80483ec:        89 e5                        mov    %esp,%ebp
 80483ee:        b8 88 77 66 55               mov    $0x55667788,%eax
 80483f3:        ba 44 33 22 11               mov    $0x11223344,%edx
 80483f8:        5d                           pop    %ebp
 80483f9:        c3                           ret    

080483fa <main>:
 80483fa:        8d 4c 24 04                  lea    0x4(%esp),%ecx
 80483fe:        83 e4 f8                     and    $0xfffffff8,%esp
 8048401:        ff 71 fc                     pushl  -0x4(%ecx)
 8048404:        55                           push   %ebp
 8048405:        89 e5                        mov    %esp,%ebp
 8048407:        51                           push   %ecx
 8048408:        83 ec 14                     sub    $0x14,%esp
 804840b:        e8 db ff ff ff               call   80483eb <normal_function>
 8048410:        89 45 f0                     mov    %eax,-0x10(%ebp)
 8048413:        89 55 f4                     mov    %edx,-0xc(%ebp)
 8048416:        b8 00 00 00 00               mov    $0x0,%eax
 804841b:        83 c4 14                     add    $0x14,%esp
 804841e:        59                           pop    %ecx
 804841f:        5d                           pop    %ebp
 8048420:        8d 61 fc                     lea    -0x4(%ecx),%esp
 8048423:        c3                           ret
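
The relationship between %al, %ax, %eax and the %edx:%eax pair can be sketched in C with plain integer arithmetic. The helper names reg_al, reg_ax and reg_edx_eax are hypothetical; they only model how the caller interprets the register contents and do not touch real registers.

```c
#include <stdint.h>

/* Hypothetical helpers modelling how a caller reads a return value. */

/* %al is the low 8 bits of %eax (used for a char result). */
uint8_t reg_al(uint32_t eax)
{
    return (uint8_t)(eax & 0xffu);
}

/* %ax is the low 16 bits of %eax (used for a short result). */
uint16_t reg_ax(uint32_t eax)
{
    return (uint16_t)(eax & 0xffffu);
}

/* A 64-bit result is split across %edx (high half) and %eax (low half). */
uint64_t reg_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}
```

With %edx = 0x11223344 and %eax = 0x55667788, as loaded by normal_function in the listing above, reg_edx_eax yields 0x1122334455667788, the value written in the C source.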


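Return values of type float and double are handled differently. Both are floating-point types, and their return values are passed in a floating-point register: in the listing below, the callee loads the value with flds or fldl, and main stores it to memory with fstps or fstpl: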

float normal_function(void)
{
        return 12.3;
}

double normal_function2(void)
{
        return 12.3;
}

int main(int argc, char **args)
{
        float val = normal_function();
        double val2 = normal_function2();

        return 0;
}

080483eb <normal_function>:
 80483eb:        55                           push   %ebp
 80483ec:        89 e5                        mov    %esp,%ebp
 80483ee:        d9 05 c8 84 04 08            flds   0x80484c8
 80483f4:        5d                           pop    %ebp
 80483f5:        c3                           ret    

080483f6 <normal_function2>:
 80483f6:        55                           push   %ebp
 80483f7:        89 e5                        mov    %esp,%ebp
 80483f9:        dd 05 d0 84 04 08            fldl   0x80484d0
 80483ff:        5d                           pop    %ebp
 8048400:        c3                           ret    

08048401 <main>:
 8048401:        8d 4c 24 04                  lea    0x4(%esp),%ecx
 8048405:        83 e4 f8                     and    $0xfffffff8,%esp
 8048408:        ff 71 fc                     pushl  -0x4(%ecx)
 804840b:        55                           push   %ebp
 804840c:        89 e5                        mov    %esp,%ebp
 804840e:        51                           push   %ecx
 804840f:        83 ec 1c                     sub    $0x1c,%esp
 8048412:        e8 d4 ff ff ff               call   80483eb <normal_function>
 8048417:        d9 5d e4                     fstps  -0x1c(%ebp)
 804841a:        8b 45 e4                     mov    -0x1c(%ebp),%eax
 804841d:        89 45 f4                     mov    %eax,-0xc(%ebp)
 8048420:        e8 d1 ff ff ff               call   80483f6 <normal_function2>
 8048425:        dd 5d e8                     fstpl  -0x18(%ebp)
 8048428:        b8 00 00 00 00               mov    $0x0,%eax
 804842d:        83 c4 1c                     add    $0x1c,%esp
 8048430:        59                           pop    %ecx
 8048431:        5d                           pop    %ebp
 8048432:        8d 61 fc                     lea    -0x4(%ecx),%esp
 8048435:        c3                           ret


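The details of the CPU registers %eip, %eax, %edx, %esp and %ebp and of the floating-point registers are outside the scope of this tutorial, so we do not explain them further here. Please refer to material on assembly language for more information.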