kk Blog: General Basics

date [-d @int|str] [+%s|"+%F %T"]
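The two directions in the `date` synopsis above (epoch seconds to a formatted string, and a time string back to epoch seconds) can also be sketched in Python. This is only a rough equivalent: it assumes UTC, whereas `date` uses the local time zone unless given `-u`, and the helper names are made up for illustration.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"  # same layout as date's "+%F %T"

def epoch_to_str(seconds):
    """Roughly `date -u -d @seconds "+%F %T"` (UTC assumed)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(FMT)

def str_to_epoch(text):
    """Roughly `date -u -d "text" +%s` for strings in the "%F %T" layout."""
    parsed = datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

print(epoch_to_str(0))
print(str_to_epoch("1970-01-02 00:00:00"))
```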

the meaning of '?' in Linux kernel panic call trace

  • ‘?’ means that the information about this stack entry is probably not reliable.

The stack output mechanism (see the implementation of dump_trace() function) was unable to prove that the address it has found is a valid return address in the call stack.

‘?’ itself is output by printk_stack_address().

The stack entry may be valid or not. Sometimes one may simply skip it. It may be helpful to investigate the disassembly of the involved module to see which function is called at ClearFunctionName+0x88 (or, on x86, immediately before that position).

Concerning reliability

On x86, when dump_stack() is called, the function that actually examines the stack is print_context_stack(), defined in arch/x86/kernel/dumpstack.c. Take a look at its code; I’ll try to explain it below.

I assume DWARF2 stack unwind facilities are not available in your Linux system (most likely, they are not, if it is not OpenSUSE or SLES). In this case, print_context_stack() seems to do the following.

It starts from an address (‘stack’ variable in the code) that is guaranteed to be an address of a stack location. It is actually the address of a local variable in dump_stack().

The function repeatedly increments that address (while (valid_stack_ptr …) { … stack++}) and checks if what it points to could also be an address in the kernel code (if (__kernel_text_address(addr)) …). This way it attempts to find the functions' return addresses pushed on stack when these functions were called.

Of course, not every unsigned long value that looks like a return address is actually a return address. So the function tries to check it. If frame pointers are used in the code of the kernel (%ebp/%rbp registers are employed for that if CONFIG_FRAME_POINTER is set), they can be used to traverse the stack frames of the functions. The return address for a function lies just above the frame pointer (i.e. at %ebp/%rbp + sizeof(unsigned long)). print_context_stack checks exactly that.

If there is a stack frame whose return address is the value that ‘stack’ points to, the value is considered a reliable stack entry. ops->address will be called for it with reliable == 1; it will eventually call printk_stack_address(), and the value will be output as a reliable call stack entry. Otherwise the address will be considered unreliable. It will be output anyway, but with ‘?’ prepended.

[NB] If frame pointer information is not available (e.g. like it was in Debian 6 by default), all call stack entries will be marked as unreliable for this reason.
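The scan described above can be imitated with a toy model. This is a simplified sketch, not the kernel code: the stack is a Python list indexed by slot, saved frame pointers are stored as slot indexes, and the text range and all addresses are invented for the example.

```python
TEXT_START, TEXT_END = 0x1000, 0x2000  # pretend kernel text range (made up)

def is_text_address(value):
    return TEXT_START <= value < TEXT_END

def scan_stack(stack, bp):
    """Walk every slot; bp is the slot of the innermost saved frame pointer, or None."""
    entries = []
    for slot, value in enumerate(stack):
        if not is_text_address(value):
            continue
        # A return address sits in the slot right after the saved frame pointer
        reliable = bp is not None and slot == bp + 1
        entries.append((value, reliable))
        if reliable:
            nxt = stack[bp]  # saved frame pointer of the caller
            bp = nxt if bp < nxt < len(stack) else None
    return entries

# slot: 0 stale value, 1 saved bp, 2 return addr, 3 stale value,
#       4 saved bp, 5 return addr, 6 end of chain, 7 return addr
STACK = [0x1234, 4, 0x1100, 0x1ABC, 6, 0x1200, 0, 0x1300]

for value, reliable in scan_stack(STACK, bp=1):
    print(("   " if reliable else " ? ") + hex(value))
```

Calling `scan_stack(STACK, bp=None)` models a kernel without frame pointers: every text-looking value is still reported, but all of them carry the ‘?’ mark.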

Systems with DWARF2 unwinding support (and with CONFIG_STACK_UNWIND set) are a whole other story.

Downloading various CentOS packages

0. Original CentOS system packages

Crawl all el7 versions

import re
import urllib.error
import urllib.request

def getHtml(url):
	# Fetch the page as text; a directory that cannot be fetched yields an empty page
	try:
		page = urllib.request.urlopen(url)
	except urllib.error.URLError:
		return ""
	return page.read().decode("utf-8", "replace")

def getHref(html, reg):
	# Return every capture-group match of reg in html
	reg = re.compile(reg)
	reslist = re.findall(reg, html)
	return reslist

URL = "https://buildlogs.centos.org/"
html = getHtml(URL)
c7Href = getHref(html, r'href="(c7.+)/"')
for ver in c7Href:
	# Skip trees whose names carry these suffixes
	if '.a32' in ver or '.a64' in ver or '.p32' in ver or '.i386' in ver:
		continue
	url1 = URL + ver + "/kernel/"
	print(url1)
	html = getHtml(url1)
	dateHref = getHref(html, r'href="(20............)/"')
	for date in dateHref:
		url2 = url1 + date + "/"
		html = getHtml(url2)
		kernelHref = getHref(html, r'href="(.+el7.x86_64)/"')
		for kver in kernelHref:
			print(url2 + kver)

https://buildlogs.centos.org/c7-dotnet/kernel/
https://buildlogs.centos.org/c7-epel/kernel/
https://buildlogs.centos.org/c7-extras.x86_64/kernel/
https://buildlogs.centos.org/c7-plus.x86_64/kernel/
https://buildlogs.centos.org/c7-plus/kernel/
https://buildlogs.centos.org/c7-rt/kernel/
https://buildlogs.centos.org/c7-updates/kernel/
https://buildlogs.centos.org/c7.00.02/kernel/
https://buildlogs.centos.org/c7.00.02/kernel/20140529190808/3.10.0-121.el7.x86_64
https://buildlogs.centos.org/c7.00.03/kernel/
https://buildlogs.centos.org/c7.00.03/kernel/20140609184350/3.10.0-121.el7.x86_64
https://buildlogs.centos.org/c7.00.04/kernel/
https://buildlogs.centos.org/c7.00.04/kernel/20140612172658/3.10.0-123.el7.x86_64
https://buildlogs.centos.org/c7.00.04/kernel/20140619231033/3.10.0-123.el7.x86_64
https://buildlogs.centos.org/c7.01.00/kernel/
https://buildlogs.centos.org/c7.01.00/kernel/20150306113403/3.10.0-229.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/
https://buildlogs.centos.org/c7.01.u/kernel/20150327030147/3.10.0-229.1.2.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/20150513100324/3.10.0-229.4.2.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/20150623220331/3.10.0-229.7.2.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/20150806010338/3.10.0-229.11.1.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/20150915124206/3.10.0-229.14.1.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/20150915150313/3.10.0-229.14.1.el7.x86_64
https://buildlogs.centos.org/c7.01.u/kernel/20151103190728/3.10.0-229.20.1.el7.x86_64
https://buildlogs.centos.org/c7.1511.00/kernel/
https://buildlogs.centos.org/c7.1511.00/kernel/20151119220809/3.10.0-327.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/
https://buildlogs.centos.org/c7.1511.exp/kernel/20151016161452/4.2.0-1.centos.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20151016163253/4.2.0-1.centos.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20151016164628/4.2.0-1.centos.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160321183722/4.3.3-200.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160324145107/4.4.6-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160324192831/4.4.6-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160325232209/4.4.6-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160415133359/4.4.7-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160506113850/4.4.9-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160601130532/4.4.11-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160602142804/4.4.12-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160608070903/4.4.13-301.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160620154312/4.4.13-303.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160625132228/4.4.14-201.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160625133615/4.4.14-201.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160815150500/4.4.17-201.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160815161333/4.4.17-201.el7.x86_64
https://buildlogs.centos.org/c7.1511.exp/kernel/20160817141019/4.4.18-201.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/
https://buildlogs.centos.org/c7.1511.u/kernel/20151209124337/3.10.0-327.3.1.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20151209140627/3.10.0-327.3.1.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160105150501/3.10.0-327.4.4.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160125220424/3.10.0-327.4.5.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160217024115/3.10.0-327.10.1.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160331160950/3.10.0-327.13.1.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160512110105/3.10.0-327.18.2.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160623161521/3.10.0-327.22.2.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160802204906/3.10.0-327.28.2.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160818163946/3.10.0-327.28.3.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20160918123639/3.10.0-327.36.1.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20161010214658/3.10.0-327.36.2.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20161010215511/3.10.0-327.36.2.el7.x86_64
https://buildlogs.centos.org/c7.1511.u/kernel/20161024152721/3.10.0-327.36.3.el7.x86_64
https://buildlogs.centos.org/c7.1611.00/kernel/
https://buildlogs.centos.org/c7.1611.01/kernel/
https://buildlogs.centos.org/c7.1611.01/kernel/20161117160457/3.10.0-514.el7.x86_64
https://buildlogs.centos.org/c7.1611.exp/kernel/
https://buildlogs.centos.org/c7.1611.exp/kernel/20171018140113/4.9.57-204.el7.x86_64
https://buildlogs.centos.org/c7.1611.exp/kernel/20171120151900/4.9.63-204.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/
https://buildlogs.centos.org/c7.1611.u/kernel/20161207134106/3.10.0-514.2.2.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170118010633/3.10.0-514.6.1.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170223034721/3.10.0-514.2.2.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170303004149/3.10.0-514.10.2.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170412150118/3.10.0-514.16.1.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170525170145/3.10.0-514.21.1.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170620122143/3.10.0-514.21.2.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170620132051/3.10.0-514.21.2.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170628200657/3.10.0-514.26.1.el7.x86_64
https://buildlogs.centos.org/c7.1611.u/kernel/20170704132018/3.10.0-514.26.2.el7.x86_64
https://buildlogs.centos.org/c7.1708.00/kernel/
https://buildlogs.centos.org/c7.1708.00/kernel/20170822030048/3.10.0-693.el7.x86_64
https://buildlogs.centos.org/c7.1708.exp.x86_64/kernel/
https://buildlogs.centos.org/c7.1708.u.x86_64/kernel/
https://buildlogs.centos.org/c7.1708.u.x86_64/kernel/20170823130501/3.10.0-693.1.1.el7.x86_64
https://buildlogs.centos.org/c7.1708.u.x86_64/kernel/20170906160426/3.10.0-693.2.1.el7.x86_64
https://buildlogs.centos.org/c7.1708.u.x86_64/kernel/20170913001530/3.10.0-693.2.2.el7.x86_64
https://buildlogs.centos.org/c7.1708.u.x86_64/kernel/20171023132245/3.10.0-693.5.2.el7.x86_64
https://buildlogs.centos.org/c7.1708.u.x86_64/kernel/20171204203818/3.10.0-693.11.1.el7.x86_64
https://buildlogs.centos.org/c7.1708.u/kernel/
https://buildlogs.centos.org/c7.1708.u/kernel/20170823130501/3.10.0-693.1.1.el7.x86_64
https://buildlogs.centos.org/c7.1708.u/kernel/20170906160426/3.10.0-693.2.1.el7.x86_64
https://buildlogs.centos.org/c7.1708.u/kernel/20170913001530/3.10.0-693.2.2.el7.x86_64
https://buildlogs.centos.org/c7.1708.u/kernel/20171023132245/3.10.0-693.5.2.el7.x86_64
https://buildlogs.centos.org/c7.1708.u/kernel/20171204203818/3.10.0-693.11.1.el7.x86_64
https://buildlogs.centos.org/c7.common/kernel/
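The link extraction used by the crawler can be tried offline on a small sample. The listing below is invented, so the real page layout may differ; the pattern is also a slightly stricter variant of the one in the script (`[^"/]+` instead of `.+`), chosen so that a match cannot run across several links on one line.

```python
import re

# Made-up excerpt of a directory listing
SAMPLE = '''
<a href="c7.00.02/">c7.00.02/</a>
<a href="c7.01.u.i386/">c7.01.u.i386/</a>
<a href="c7.1708.u.x86_64/">c7.1708.u.x86_64/</a>
<a href="c6.10.00/">c6.10.00/</a>
'''

SKIP = ('.a32', '.a64', '.p32', '.i386')

def c7_dirs(html):
    # Capture directory names that start with "c7", one per href
    names = re.findall(r'href="(c7[^"/]+)/"', html)
    # Drop the trees the crawler also skips
    return [n for n in names if not any(tag in n for tag in SKIP)]

print(c7_dirs(SAMPLE))
```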

1. System packages

http://mirror.centos.org/centos/6.5/os/x86_64/Packages/
Mirrors in China
http://isoredirect.centos.org/centos/6.5/isos/x86_64/
e.g.:
	http://mirror.symnds.com/distributions/CentOS-vault/5.5/isos/x86_64/  
	http://mirrors.stuhome.net/centos/6.5/isos/x86_64/  
	http://mirrors.neusoft.edu.cn/centos/6.5/isos/x86_64/
	http://mirrors.163.com/centos/6.5/isos/x86_64/
	http://mirrors.hust.edu.cn/centos/6.5/isos/x86_64/
	http://centos.ustc.edu.cn/centos/6.5/isos/x86_64/
	http://mirror.bit.edu.cn/centos/6.5/isos/x86_64/
	http://mirrors.tuna.tsinghua.edu.cn/centos/6.5/isos/x86_64/
	http://mirrors.grandcloud.cn/centos/6.5/isos/x86_64/
	http://mirror.neu.edu.cn/centos/6.5/isos/x86_64/
	http://mirrors.btte.net/centos/6.5/isos/x86_64/
	http://mirrors.hustunique.com/centos/6.5/isos/x86_64/
	http://mirrors.aliyun.com/centos/6.5/isos/x86_64/

2. debuginfo packages:

http://debuginfo.centos.org/6/x86_64/

3. src.rpm packages

ftp://ftp.redhat.com/pub/redhat/linux/enterprise
ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Client/en/os/SRPMS/kexec-tools-1.102pre-154.el5.src.rpm
ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Client/en/os/SRPMS/kexec-tools-1.102pre-164.el5.src.rpm
http://vault.centos.org/5.11/os/SRPMS/kexec-tools-1.102pre-165.el5.src.rpm

4. Miscellaneous packages

pkgs/org

TSC clock initialization

TSC clocksource initialization
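//    Call path: time_init->tsc_init
//    Tasks:
//        1. Calibrate the TSC, obtain the TSC frequency, and set the CPU frequency equal to the TSC frequency
//        2. Initialize the TSC-based delay functions
//        3. Check the properties of the TSC
//            3.1 Whether the TSCs are synchronized across CPUs
//                3.1.1 If they are not, mark the TSC unstable and set rating=0
//            3.2 Whether the TSC is stable
//        4. Register the TSC clocksource device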
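void __init tsc_init(void)
{
	u64 lpj;
	int cpu;

	// Calibrate the TSC and obtain its frequency
	tsc_khz = x86_platform.calibrate_tsc();
	// The CPU frequency is taken to be the TSC frequency
	cpu_khz = tsc_khz;
	// Compute the per-CPU scale used to convert cycles to ns
	for_each_possible_cpu(cpu)
	    set_cyc2ns_scale(cpu_khz, cpu);
	// Initialize the TSC-based delay functions: ndelay, udelay, mdelay
	use_tsc_delay();
	// Check whether the TSCs are synchronized across CPUs
	if (unsynchronized_tsc())
	    mark_tsc_unstable("TSCs unsynchronized");
	// Check whether the TSC is reliable
	check_system_tsc_reliable();
	// Register the TSC clocksource device
	init_tsc_clocksource();
}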
Delay functions: ndelay, udelay, mdelay

Short delays implemented via the TSC

void use_tsc_delay(void)
{
	// Use the TSC for short delays
	delay_fn = delay_tsc;
}
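TSC delay function

Short delays are implemented by polling with rep_nop. Kernel preemption is disabled while the TSC is read, so that a reading is not affected by a move between CPUs.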

static void delay_tsc(unsigned long loops)
{
	unsigned long bclock, now;
	int cpu;
	// Short delay: disable kernel preemption
	preempt_disable();
	// The CPU that delay_tsc is currently running on
	cpu = smp_processor_id();
	rdtsc_barrier();
	rdtscl(bclock);
	for (;;) {
	    rdtsc_barrier();
	    rdtscl(now);
	    if ((now - bclock) >= loops)
	        break;
	    // Allow RT-policy tasks to run
	    preempt_enable();
	    // No-op (pause)
	    rep_nop();
	    preempt_disable();

	    // delay_tsc may have migrated to a different CPU while running.
	    // TSCs are per-CPU, so rebase the counter on the new CPU.
	    if (unlikely(cpu != smp_processor_id())) {
	        loops -= (now - bclock);
	        cpu = smp_processor_id();
	        rdtsc_barrier();
	        rdtscl(bclock);
	    }
	}
	preempt_enable();
}
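The migration handling in delay_tsc() can be mimicked with a small simulation. Everything here is an assumption made for illustration: `FakeMachine` is a hypothetical model in which all TSCs tick together but start from different offsets, each read costs one cycle, and a migration is scripted at a chosen tick.

```python
class FakeMachine:
    """Hypothetical machine: per-CPU TSC offsets, one shared tick counter."""
    def __init__(self, offsets, migrate_at):
        self.offsets = offsets        # per-CPU TSC offset
        self.migrate_at = migrate_at  # tick number -> CPU to move to
        self.ticks = 0                # cycles elapsed since the start
        self.cpu = 0

    def rdtsc(self):
        self.ticks += 1               # every read costs one cycle in this model
        return self.offsets[self.cpu] + self.ticks

    def reschedule_point(self):
        # Stands in for the preempt_enable()/preempt_disable() window
        self.cpu = self.migrate_at.get(self.ticks, self.cpu)


def delay_tsc(machine, loops):
    cpu = machine.cpu
    bclock = machine.rdtsc()
    while True:
        now = machine.rdtsc()
        if now - bclock >= loops:
            break
        machine.reschedule_point()
        if cpu != machine.cpu:
            # Charge the cycles already waited, then restart from the new CPU's TSC
            loops -= now - bclock
            cpu = machine.cpu
            bclock = machine.rdtsc()


machine = FakeMachine(offsets={0: 0, 1: 10**6}, migrate_at={5: 1})
delay_tsc(machine, 20)
print(machine.ticks, machine.cpu)
```

Without the rebase step, the first read on CPU 1 would appear to be about a million cycles past `bclock` and the loop would end far too early; with it, the wait still covers at least the requested number of cycles.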
Checking whether the TSCs are synchronized
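//    Call path: tsc_init->unsynchronized_tsc
//    How the check works:
//        1. If the APICs span multiple boards, the TSCs are not synchronized
//        2. If cpuid reports a constant (stable) TSC, the TSCs are synchronized
//        3. TSCs on Intel CPUs are assumed to be synchronized
//        4. By default, TSCs on multi-CPU systems from other vendors are treated as unsynchronized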
__cpuinit int unsynchronized_tsc(void)
{
	// If the APICs are spread across multiple boards, the TSCs may be out of sync
	if (apic_is_clustered_box())
	    return 1;
	// The CPU has a constant (stable) TSC
	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
	    return 0;
	// TSCs on Intel CPUs are assumed to be synchronized
	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) {
	    // Non-Intel CPU: with more than one CPU, assume the TSCs are unsynchronized
	    if (num_possible_cpus() > 1)
	        tsc_unstable = 1;
	}
	return tsc_unstable;
}
Marking the TSC unstable
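//    Call path: tsc_init->mark_tsc_unstable
//    Tasks:
//        1. If the TSC clocksource is already registered, set its rating to 0 asynchronously to flag it as unstable
//        2. If the TSC clocksource is not yet registered, set its rating to 0 synchronously to flag it as unstable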
void mark_tsc_unstable(char *reason)
{
	if (!tsc_unstable) {
	    tsc_unstable = 1;
	    sched_clock_stable = 0;
	    // The TSC clocksource is already registered
	    if (clocksource_tsc.mult)
	    {
	        clocksource_mark_unstable(&clocksource_tsc);
	    }
	    // Not yet registered: set the rating to the lowest value so that it is never picked as the best clocksource
	    else {
	        clocksource_tsc.flags |= CLOCK_SOURCE_UNSTABLE;
	        clocksource_tsc.rating = 0;
	    }
	}
}
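Why a rating of 0 keeps the TSC from being chosen can be shown with a toy selection function. This is only a stand-in for the kernel's clocksource selection, assuming it simply prefers the highest rating; the names and rating values in the dictionary are illustrative, not read from a running system.

```python
def best_clocksource(sources):
    """Pick the name with the highest rating (simplified stand-in)."""
    return max(sources, key=sources.get)

# Illustrative ratings only
sources = {"tsc": 300, "hpet": 250, "acpi_pm": 200, "jiffies": 1}
print(best_clocksource(sources))

sources["tsc"] = 0  # the effect of mark_tsc_unstable() on the rating
print(best_clocksource(sources))
```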
Registering the TSC clocksource
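//    Tasks:
//        1. Compute the mult value of the TSC
//        2. Check whether the TSC is stable
//            2.1 If it is not, lower its rating and clear the clocksource's continuous flag
//        3. Register the TSC clocksource with the system
//    Call path: tsc_init->init_tsc_clocksource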
static void __init init_tsc_clocksource(void)
{
	// Compute the mult value of the TSC
	clocksource_tsc.mult = clocksource_khz2mult(tsc_khz,
	        clocksource_tsc.shift);
	// If the TSC has already been verified as reliable, clear the must-verify flag
	if (tsc_clocksource_reliable)
	    clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
	
	// Check whether the TSC is stable
	// Global variables set before this point during tsc_init record whether it is stable and reliable
	if (check_tsc_unstable()) {
	    // If the TSC is unstable, drop its rating to the lowest value and clear the continuous flag
	    clocksource_tsc.rating = 0;
	    clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
	}
	// Register the TSC clocksource with the system
	clocksource_register(&clocksource_tsc);
}

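TSC time errors

arch/x86/kernel/tsc.c:
At boot, initialization calls tsc_init() -> set_cyc2ns_scale() to set the per-CPU variables cyc2ns and cyc2ns_offset, which are later used by sched_clock()->native_sched_clock()->__cycles_2_ns().

In cpufreq_tsc(), if the following check succeeds:
// The CPU has a constant (stable) TSC
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
return 0;
the function returns early, so time_cpufreq_notifier is usually never registered and set_cyc2ns_scale is not called again.

  • Symptom: the TIME and CPU values reported by top and ps are wildly wrong.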
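// Read the value of the TSC register (x86 only)
#include <stdio.h>

int main(void)
{
	unsigned int low, high;
	unsigned long long val;
	// rdtsc returns the low 32 bits in EAX and the high 32 bits in EDX
	asm volatile("rdtsc" : "=a" (low), "=d" (high));
	val = low | ((unsigned long long)high << 32);
	printf("%llu\n", val);
	return 0;
}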

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733043

Xeon E5 has a bug: it does not reset the TSC on warm reboot but keeps its value instead. See “BT81. X X X No Fix TSC is Not Affected by Warm Reset” http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-family-spec-update.pdf

Kernel 2.6.32 also has a bug.
Xeon bug + kernel bug = hang after a warm reboot (or kexec) once 208.5 days
have passed since booting. So administrators should shut the machine down
completely once and then boot it again, because “shutdown -r” causes a hang.

Red Hat has released a fix for this in the kernel 2.6.32-220, 2.6.32-279
and 2.6.32-358 series (RHEL 6.x): https://access.redhat.com/site/solutions/433883 (details for subscribers only :-( )

Attached patch is based on upstream patch.
see http://kernel.opensuse.org/cgit/kernel/patch/?id=9993bc635d01a6ee7f6b833b4ee65ce7c06350b1


Red Hat Enterprise Linux 6.1 (kernel-2.6.32-131.26.1.el6 and newer)
Red Hat Enterprise Linux 6.2 (kernel-2.6.32-220.4.2.el6 and newer)
Red Hat Enterprise Linux 6.3 (kernel-2.6.32-279 series)
Red Hat Enterprise Linux 6.4 (kernel-2.6.32-358 series)
Any Intel® Xeon® E5, Intel® Xeon® E5 v2, or Intel® Xeon® E7 v2 series processor


From 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 Mon Sep 17 00:00:00 2001
From: Salman Qazi <sqazi@google.com>
Date: Sat, 10 Mar 2012 00:41:01 +0000
Subject: sched/x86: Fix overflow in cyc2ns_offset

When a machine boots up, the TSC generally gets reset. However, when kexec is used to boot into a kernel, the TSC value would be carried over from the previous kernel. The computation of cycns_offset in set_cyc2ns_scale is prone to an overflow, if the machine has been up more than 208 days prior to the kexec. The overflow happens when we multiply *scale, even though there is enough room to store the final answer.

We fix this issue by decomposing tsc_now into the quotient and remainder of division by CYC2NS_SCALE_FACTOR and then performing the multiplication separately on the two components.

Refactor code to share the calculation with the previous fix in __cycles_2_ns().
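The overflow and the fix can be reproduced outside the kernel by emulating 64-bit wraparound. This is a sketch under stated assumptions: `CYC2NS_SCALE_FACTOR` is taken to be 10 as in the x86 timer code of that era, the 2 GHz TSC frequency is made up, and `naive_ns`, `fixed_ns` and `tsc_after` are hypothetical helper names.

```python
MASK64 = (1 << 64) - 1
CYC2NS_SCALE_FACTOR = 10        # assumed value, 2^10
NSEC_PER_MSEC = 1_000_000

def scale_for(cpu_khz):
    return (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR) // cpu_khz

def naive_ns(tsc_now, scale):
    # Pre-fix formula: the 64-bit product wraps before the shift
    return ((tsc_now * scale) & MASK64) >> CYC2NS_SCALE_FACTOR

def mult_frac(x, numer, denom):
    # Same decomposition as the macro in the patch: quotient and remainder
    quot, rem = divmod(x, denom)
    return (quot * numer + (rem * numer) // denom) & MASK64

def fixed_ns(tsc_now, scale):
    return mult_frac(tsc_now, scale, 1 << CYC2NS_SCALE_FACTOR)

def tsc_after(days, cpu_khz):
    # Cycles counted after the given uptime
    return days * 86400 * cpu_khz * 1000

# tsc_now * scale is roughly nanoseconds * 2^10, so it wraps at 2^54 ns of uptime
THRESHOLD_DAYS = 2**54 / 1e9 / 86400

cpu_khz = 2_000_000             # assumed 2 GHz TSC
scale = scale_for(cpu_khz)
for days in (200, 209):
    tsc = tsc_after(days, cpu_khz)
    print(days, naive_ns(tsc, scale), fixed_ns(tsc, scale))
print(round(THRESHOLD_DAYS, 1))
```

The threshold does not depend on the TSC frequency, because a higher frequency raises tsc_now and lowers *scale by the same factor.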

Signed-off-by: Salman Qazi <sqazi@google.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>


patch: http://kernel.opensuse.org/cgit/kernel/patch/?id=9993bc635d01a6ee7f6b833b4ee65ce7c06350b1

diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index 431793e..34baa0e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -57,14 +57,10 @@ DECLARE_PER_CPU(unsigned long long, cyc2ns_offset);
 
 static inline unsigned long long __cycles_2_ns(unsigned long long cyc)
 {
- unsigned long long quot;
- unsigned long long rem;
  int cpu = smp_processor_id();
  unsigned long long ns = per_cpu(cyc2ns_offset, cpu);
- quot = (cyc >> CYC2NS_SCALE_FACTOR);
- rem = cyc & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
- ns += quot * per_cpu(cyc2ns, cpu) +
-     ((rem * per_cpu(cyc2ns, cpu)) >> CYC2NS_SCALE_FACTOR);
+ ns += mult_frac(cyc, per_cpu(cyc2ns, cpu),
+         (1UL << CYC2NS_SCALE_FACTOR));
  return ns;
 }
 
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..183c592 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -620,7 +620,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
  if (cpu_khz) {
      *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
-     *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+     *offset = ns_now - mult_frac(tsc_now, *scale,
+                      (1UL << CYC2NS_SCALE_FACTOR));
  }
 
  sched_clock_idle_wakeup_event(0);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index e834342..d801acb 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -85,6 +85,19 @@
 }                            \
 )
 
+/*
+ * Multiplies an integer by a fraction, while avoiding unnecessary
+ * overflow or loss of precision.
+ */
+#define mult_frac(x, numer, denom)(          \
+{                            \
+ typeof(x) quot = (x) / (denom);         \
+ typeof(x) rem  = (x) % (denom);         \
+ (quot * (numer)) + ((rem * (numer)) / (denom)); \
+}                            \
+)
+
+
 #define _RET_IP_     (unsigned long)__builtin_return_address(0)
 #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; }) 

mark problem

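Wide characters and multibyte characters in C: MB_CUR_MAX

stdlib.h
MB_CUR_MAX is the maximum number of bytes in a multibyte character in the current locale.
Processing strings as multibyte characters is much slower.
Some versions of the sort command choose their comparison function based on the locale's LANG setting; if it is not set appropriately, sort runs very slowly.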

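Kernel stack overflow

Because thread_info is stored at the bottom of the kernel stack, a kernel stack overflow corrupts thread_info, so things go wrong when the process is scheduled, for example when it sleeps, is interrupted, or is preempted.
This tends to show up as errors at try_to_wake_up+XXX and the like.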

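Notes on timers

Suppose mod_timer arms a timer to fire immediately and setup_timer is then called afterwards.
What can happen is that the timer handler has already started running in interrupt context when setup_timer modifies the timer structure, so the handler fails when it later uses that structure.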

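Viewing kernel stacks

Use echo t > /proc/sysrq-trigger to dump the kernel stacks of all tasks.
You can also look at /proc/{pid}/wchan, which holds the name of the kernel function where the process is blocked; it is worth checking when nothing else has worked.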
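Reading wchan for a given pid can be scripted. A minimal sketch, assuming a Linux /proc filesystem; `read_wchan` is a made-up helper name, and it returns None when the file is missing or unreadable.

```python
import os

def read_wchan(pid):
    """Return the contents of /proc/<pid>/wchan, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/wchan") as f:
            return f.read().strip()
    except OSError:
        return None

print(read_wchan(os.getpid()))
```

A task that is currently running is not blocked anywhere, so the value for the calling process itself is usually not a function name.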

mark

Fixing an ext4 journal (jbd2) bug
Another jbd2 bug found in RHEL 6
Software RAID problems on RHEL 6
stable pages
Tracking down a CPU running at 100%