kk Blog: General Basics

Accessing Unexported Linux Kernel Symbols

2013-05-07 18:16:00

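Starting with some release in the Linux 2.6 series, the kernel introduced a symbol export mechanism. Only symbols that the kernel exports with EXPORT_SYMBOL or EXPORT_SYMBOL_GPL can be used directly from a kernel module.

However, the kernel does not export every symbol. In kernel 3.8.0, for example, do_page_fault is not exported.

My kernel module needs to use do_page_fault, so what are the options, and what are the trade-offs of each?

The approaches are analyzed one by one below, using do_page_fault as the example:

Modify the kernel and add EXPORT_SYMBOL(do_page_fault) or EXPORT_SYMBOL_GPL(do_page_fault)

This approach applies only when you are able to modify the kernel. When you can, it is the simplest option.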

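Look the symbol up with kallsyms_lookup_name

kallsyms_lookup_name is itself a kernel symbol. If it is exported, a kernel module can call kallsyms_lookup_name("do_page_fault") to obtain the address of do_page_fault.
The limitation of this approach is that kallsyms_lookup_name itself is not necessarily exported.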

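Read /boot/System.map-<kernel-version> and pass the address in as a module parameter

System.map-<kernel-version> is generated when the kernel is built and records the addresses of the kernel symbols at build time. As long as the running kernel and the System.map-<kernel-version> file correspond exactly, the symbol addresses read from the file are correct. Here, kernel-version can be obtained with 'uname -r'.
This approach has limitations too: the System.map-<kernel-version> file may not exist when the module runs, and even if it does, there is no guarantee that it matches the running kernel.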

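Read /proc/kallsyms and pass the address in as a module parameter

/proc/kallsyms is a special file that is not stored on disk. The kernel generates its contents only when the file is read. Because the contents are generated dynamically by the running kernel, the addresses read from it are guaranteed to be correct, which avoids the mismatch problem of the System.map file.
Note that starting with kernel 2.6.37, ordinary users cannot read the real values from /proc/kallsyms unless kernel pointer restriction is turned off by setting /proc/sys/kernel/kptr_restrict to 0. In some versions the file appears empty to them; in newer versions every symbol address in it reads as 0. The root user, however, can read the real values. Fortunately, loading a module requires root privileges anyway, so a script can fetch the symbol address when the module is loaded. The command: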

# cat /proc/kallsyms | grep "\<do_page_fault\>" | awk '{print $1}'

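In the kernel symbol table, the first column is the address of the function or variable in the kernel, the second is the symbol type, the third is the symbol name, and the fourth is the module the symbol belongs to. If the fourth column is empty, the symbol belongs to the core kernel code.

Symbol type    Meaning
b    symbol is in the uninitialized data section (BSS)
c    common symbol (uninitialized data)
d    symbol is in the initialized data section
g    symbol is in an initialized data section for small objects
i    indirect reference to another symbol
n    debugging symbol
r    symbol is in a read-only data section
s    symbol is in an uninitialized data section for small objects
t    symbol is in the text (code) section
u    symbol is undefined

If a symbol is global within the kernel, its type letter is uppercase, for example T or U.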

Notes on C Output Buffering Functions

2013-05-07 18:15:00

#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int i = 0;
	while(1) {
		printf("sleeping %d", i++); //(1)
		fflush(stdout);
		sleep(1);
	}
	return 0;
}

1

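printf writes "sleeping %d" into the buffer of the standard output stream (the buffer lives in memory). fflush(stdout) forces the buffered contents out, so that they appear on the display. (When stdout is line buffered, as it is on a terminal, printing "\n" has the same effect as a newline followed by fflush(stdout).)

                       fflush()
buffer (in memory) -----------> hard disk/monitor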

2

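Three streams are opened automatically. Their FILE pointers are stdin, stdout, and stderr, and the corresponding file descriptors are STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO.

Stream buffering properties:

There are three buffering types: fully buffered (most streams are of this type), line buffered (for example stdin and stdout when attached to a terminal), and unbuffered (for example stderr).
As an example of full buffering, after an fputs or fprintf on an ordinary file the data is not written to the disk file immediately. It is actually written only when the buffer fills up or when fflush or fclose is called on the file.
The following functions set a stream's buffering type: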

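int setvbuf(FILE *stream, char *buf, int mode, size_t size);
void setbuf(FILE *stream, char *buf);
void setbuffer(FILE *stream, char *buf, size_t size);
void setlinebuf(FILE *stream);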

3

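fflush() flushes the buffer inside the FILE * (which lives in the user-space process) to the kernel.
fsync() flushes the corresponding kernel buffers (the buffering at the VFS layer) to the disk.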

4

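The Linux standard library has a set of functions known as "high-level I/O"; the familiar printf(), fopen(), fread(), and fwrite() all belong to it. They are also called "buffered I/O". Each write goes only into an in-memory buffer, and the buffer's contents are written to the file in one go once some condition is met (the buffer reaches a certain size, a newline \n is written to a line-buffered stream, or the stream is flushed or closed). This greatly speeds up file reads and writes.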

The three types of buffering available are unbuffered, block buffered, and line buffered. When an output stream is unbuffered, information appears on the destination file or terminal as soon as written; when it is block buffered many characters are saved up and written as a block; when it is line buffered characters are saved up until a newline is output or input is read from any stream attached to a terminal device (typically stdin). The function fflush(3) may be used to force the block out early. (See fclose(3).) Normally all files are block buffered. When the first I/O operation occurs on a file, malloc(3) is called, and a buffer is obtained. If a stream refers to a terminal (as stdout normally does) it is line buffered. The standard error stream stderr is always unbuffered by default.

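Generally speaking, block buffering is more efficient because it merges many operations into one. Data is first cached inside the standard library, and the actual write happens when the buffer is full or when the program explicitly calls fflush. setbuf lets the caller supply the buffer (of BUFSIZ bytes); a different buffer size can be chosen with setvbuf.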

setbuf()

#include <stdio.h>
void setbuf(FILE *stream, char *buf);

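This function must be called before any output is written to the stream, so it is usually placed among the first statements in main. setbuf has a classic pitfall, though, mentioned both in the man page and in C Traps and Pitfalls: You must make sure that both buf and the space it points to still exist by the time stream is closed, which also happens at program termination. For example, the following is illegal: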

#include <stdio.h>
int main(void)
{
	char buf[BUFSIZ];        /* automatic storage: gone once main returns */
	setbuf(stdout, buf);     /* intentional bug: stdout outlives buf */
	printf("Hello, world!\n");
	return 0;
}

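This program is wrong. The last flush of buf happens after main returns, as part of the cleanup the C runtime must perform before handing control back to the operating system, but by then the buf array has already been released. The fix is to make buf static or global, or to allocate it dynamically with malloc.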

#include <stdlib.h>
setbuf(stdout, malloc(BUFSIZ));

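There is no need to check malloc's return value here. If malloc fails it returns a null pointer, and setbuf accepts a null second argument, in which case the stream is simply unbuffered.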

fflush()

fflush flushes the buffer, writing the buffered contents out to the file.

#include <stdio.h>
int fflush(FILE *stream);

The function fflush forces a write of all user-space buffered data for the given output or update stream via the stream underlying write function. The open status of the stream is unaffected. If the stream argument is NULL, fflush flushes all open output streams.

However, fflush flushes only the buffers inside the C library. Flushing the remaining kernel-side buffers requires fsync or sync.

Note that fflush() only flushes the user space buffers provided by the C library. To ensure that the data is physically stored on disk the kernel buffers must be flushed too, e.g. with sync(2) or fsync(2).

fsync() and sync()

fsync and sync are what finally push the buffered data out to the file on disk.

#include <unistd.h>
int fsync(int fd);

fsync copies all in-core parts of a file to disk, and waits until the device reports that all parts are on stable storage. It also updates metadata stat information. It does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync on the file descriptor of the directory is also needed.

The sync command simply calls the sync function to write the buffered data out to disk.

Assigning Variables and Calling Functions Through Absolute Memory Addresses

2013-05-07 18:15:00

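The same thing can be expressed in different ways. Besides assigning to a variable directly or calling a function by name, you can also assign to a variable and call a function through an absolute memory address.

1. Modifying a variable's value through its address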

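int x;
int *p;
printf("%x\n", (unsigned int)&x);   /* 32-bit build assumed: prints the address of x */
p = (int *)0x0012ff60;              /* the address printed above; valid for this run only */
*p = 3;
printf("%d\n", x);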

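The program prints:
12ff60
3

The program first prints the address of x, hexadecimal 0x12ff60 (a full address would be eight hex digits, but the leading zeros are omitted). It then defines a pointer variable, points it at that address, and modifies x by writing through the pointer.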

Sample code:

int *ptr=(int*)0xa4000000;
*ptr=0xaabb;
printf("%d\n",*ptr);

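This program will crash. It points a pointer at an arbitrary address and writes through it, which is dangerous because nothing guarantees that the address refers to memory the program is allowed to access. This practice is therefore not allowed.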

2. Calling a function through its address

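#include <stdio.h>

typedef void (*FuncPtr)(void);

void p(void)
{
	printf("MOP\n");
}

int main(void)
{
	void (*ptr)(void);
	p();
	printf("%x\n", (unsigned int)p);   /* 32-bit build assumed: prints the entry address of p */
	ptr = (void (*)(void))0x4110f0;    /* the address printed above; valid for this build only */
	ptr();                             /* call through the function pointer */
	((void (*)(void))0x4110f0)();      /* call through a cast, without a pointer variable */
	((FuncPtr)0x4110f0)();             /* the same cast, spelled with the typedef */
	return 0;
}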

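The program prints:
MOP
4110f0
MOP
MOP
MOP

The program first declares a function pointer ptr. It then calls the function by name, which prints MOP, and prints the function's entry address, 4110f0. Next it assigns p's entry address to ptr and calls ptr, which prints MOP. After that it calls the entry address directly, without a function pointer variable, and again prints MOP. The final call does the same thing through the typedef.

A function's code is stored in the code segment, so every call jumps to the same place, while the arguments are pushed onto the stack. The entry address is therefore determined by the function name alone and is independent of the arguments and the return value: however these differ from call to call, the function's entry address stays the same.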

Consider the following program:

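#include <stdio.h>

int p(int a, int b)
{
	return 3;
}

int main(void)
{
	printf("%x\n", (unsigned int)p);   /* 32-bit build assumed: entry address in hex */
	int a = p(2, 3);
	printf("%d\n", (int)p);            /* the same address in decimal */
	int b = p(4, 5);
	printf("%x\n", (unsigned int)p);   /* unchanged after the calls */
	return 0;
}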

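The program prints:
411159
4264281
411159
Hexadecimal 411159 is 4264281 in decimal. The program prints p's entry address, and the entry address does not change whether or not p has been called. Now consider the following code: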

#include <stdio.h>

int p(int a, int b)
{
	return ((a > b) ? a : b);
}

int main(void)
{
	int (*ptr)(int, int);
	ptr = (int (*)(int, int))0x411159;   /* p's entry address in the author's build */
	int c = ptr(5, 6);
	printf("%d\n", c);
	return 0;
}

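The program prints:
6
This calls a function that has parameters and a return value through a function pointer, using the function's entry address rather than its name.
A function is stored in the code area of memory and therefore has an address too. Each function is assigned an entry address at build time; this entry address is called the pointer to the function, and the function's name evaluates to that address. A function pointer must not point to a function of a different type or with different parameters.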
