RHEL 7.2 tuning tips
Published: 2019-05-25

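Making journal data persistent

By default, systemd-journald.service keeps the journal only in memory (under /run/log/journal). On a server that stays up for a long time, this means older service status information can no longer be retrieved with journalctl. To fix this, edit /etc/systemd/journald.conf and add Storage=persistent, so that the journal is saved permanently to files on disk.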
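The change described above can be scripted. What follows is only a sketch: the helper name set_journald_storage is made up for illustration, and it assumes GNU sed (for -i) and a stock journald.conf in which the Storage= line is either commented out or absent.

```shell
# set_journald_storage FILE
# Hypothetical helper: force Storage=persistent in a journald.conf-style file.
set_journald_storage() {
    conf="$1"
    if grep -q '^[#[:space:]]*Storage=' "$conf"; then
        # A Storage= line exists (possibly commented out): rewrite it in place.
        sed -i 's/^[#[:space:]]*Storage=.*/Storage=persistent/' "$conf"
    else
        # No Storage= line at all: append one. journald.conf only has a
        # [Journal] section, so the end of the file is still inside it.
        printf 'Storage=persistent\n' >> "$conf"
    fi
}
```

On a real host this would be called as set_journald_storage /etc/systemd/journald.conf, followed by systemctl restart systemd-journald so the new setting is picked up.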

Tuning sysctl.conf

/proc/sys/kernel/panic                         10              (seconds to wait before rebooting after a kernel panic)
/proc/sys/kernel/perf_event_max_sample_rate    100000          (raise the maximum perf sampling rate)
/proc/sys/kernel/printk                        7 4 1 7         (kernel debug message printing)
/proc/sys/net/netfilter/nf_conntrack_max       4194304         (track more connections for iptables)   echo $[128*1024*1024*1024/16384/2]
/proc/sys/net/netfilter/nf_conntrack_buckets   524288          (hash size)   echo $[128*1024*1024*1024/131072/2]
/proc/sys/vm/dirty_ratio                       30
/proc/sys/vm/swappiness                        10
/proc/sys/vm/overcommit_memory                 2               (refuse memory overcommit)
/proc/sys/vm/max_map_count                     may be > 65535  (maximum number of memory map areas a process may use)
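The two conntrack values are derived from the amount of RAM (128 GB in the formulas above). As a rough sketch, the same arithmetic can be wrapped in two small functions that take the RAM size in GiB; the function names are invented here, and $(( )) is used as the portable equivalent of the older $[ ] syntax.

```shell
# conntrack_max RAM_GIB: RAM in bytes / 16384 / 2, as in the list above
conntrack_max() {
    echo $(( $1 * 1024 * 1024 * 1024 / 16384 / 2 ))
}

# conntrack_buckets RAM_GIB: RAM in bytes / 131072 / 2, as in the list above
conntrack_buckets() {
    echo $(( $1 * 1024 * 1024 * 1024 / 131072 / 2 ))
}

conntrack_max 128       # prints 4194304
conntrack_buckets 128   # prints 524288
```

To make the settings survive a reboot, the values would go into /etc/sysctl.conf under the matching dotted names (for example net.netfilter.nf_conntrack_max). One caveat worth checking on your own system: on older kernels nf_conntrack_buckets may be read-only, in which case the hash size has to be set through the nf_conntrack module's hashsize parameter instead.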

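PCP usage reference

Reference: http://www.pcp.io/docs/guide.html

PCP (Performance Co-Pilot) is mainly used to analyze the processes on a system: it shows how much of each resource they are currently using, and the returned values let you judge a process's resource usage.

List the available metrics:

pminfo -h localhost

Example: query how much each disk partition has read since boot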

[root@gx-yun-084036 .ssh]# pminfo -h localhost -dfmtT disk.partitions.read_bytes

disk.partitions.read_bytes PMID: 60.10.6 [number of bytes read for storage partitions]
    Data Type: 32-bit unsigned int  InDom: 60.10 0xf00000a
    Semantics: counter  Units: Kbyte
Help:
Cumulative number of bytes read since system boot time (subject to
counter wrap) for individual disk partitions or logical volumes.
    inst [0 or "sda1"] value 22367
    inst [1 or "sda2"] value 592513
    inst [2 or "sda3"] value 1424
    inst [3 or "sdb1"] value 557470
    inst [4 or "sdc1"] value 1568

Continuously watch the current write rate of each disk partition

[root@gx-yun-084036 .ssh]# pmval -t 2sec -f 3 disk.partitions.write -h localhost

metric:    disk.partitions.write
host:      gx-yun-084036.vclound.com
semantics: cumulative counter (converting to rate)
units:     count (converting to count / sec)
samples:   all

         sda1                  sda2                  sda3                  sdb1                  sdc1
        0.000                 0.000                 0.000                 0.000                 0.000
        0.000                 0.000                 0.000                 0.000                 0.000
        0.000                 2.498                 0.000                 0.500                 0.500
        0.000                 2.498                 0.000                 0.000                 0.000
        0.000                 0.000                 0.000                 0.000                 0.000
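The pmval header says the metric is a cumulative counter that gets converted to a rate. To illustrate what that conversion means, here is a minimal sketch of turning two successive samples of a 32-bit counter into a per-second rate. This is not PCP's own code: the function name counter_rate and the handling of a single wrap-around are assumptions made for the example.

```shell
# counter_rate PREV CUR SECONDS
# Per-second rate between two samples of a 32-bit unsigned counter,
# allowing for at most one wrap-around between the samples.
counter_rate() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" 'BEGIN {
        delta = cur - prev
        if (delta < 0)
            delta += 4294967296    # counter wrapped past 2^32
        printf "%.3f\n", delta / secs
    }'
}

counter_rate 100 105 2          # 5 writes in 2 seconds -> 2.500
counter_rate 4294967290 4 2     # wrapped: 10 writes in 2 seconds -> 5.000
```

Values such as 2.498 in the output above presumably come from the real interval between samples being slightly longer than the requested 2 seconds.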

Another example

[root@gx-yun-084036 .ssh]# pmdumptext -Xlimu -t 2sec 'kernel.all.load[1]' mem.util.used disk.partitions.write -h localhost
[ 1] localhost:kernel.all.load["1 minute"]
[ 2] localhost:mem.util.used
[ 3] localhost:disk.partitions.write["sda1"]
[ 4] localhost:disk.partitions.write["sda2"]
[ 5] localhost:disk.partitions.write["sda3"]
[ 6] localhost:disk.partitions.write["sdb1"]
[ 7] localhost:disk.partitions.write["sdc1"]

             Column          1       2       3       4       5       6       7
             Source     localh  localh  localh  localh  localh  localh  localh
             Metric       load    used   write   write   write   write   write
               Inst     1 minu     n/a    sda1    sda2    sda3    sdb1    sdc1
              Units       none       b     c/s     c/s     c/s     c/s     c/s
Mon Sep 26 16:20:07       0.08  14.78G       ?       ?       ?       ?       ?
Mon Sep 26 16:20:09       0.08  14.78G    0.00    0.00    0.00    0.00    0.00
Mon Sep 26 16:20:11       0.08  14.78G    0.00   36.00    0.00    0.00    0.00
Mon Sep 26 16:20:13       0.07  14.78G    0.00    2.50    0.00    0.50    0.50
Mon Sep 26 16:20:15       0.07  14.78G    0.00    0.00    0.00    0.50    0.00
Mon Sep 26 16:20:17       0.07  14.78G    0.00    0.00    0.00    0.00    0.00

pcp atop can monitor current resource usage in real time (available on RHEL 7.2 and later)

Real-time monitoring of system resources

[root@gx-yun-084036 .ssh]# pmcollectl
#<---------CPU---------><------------Disks----------><--------Network---------->
#cpu    sys inter  ctxsw KBRead  Reads KBWrit Writes KBIn  PktIn  KBOut  PktOut
     0     0   888   1773      0      0     8      6    54     81      7     73
     0     0   620   1712      0      0     4      3    55     96      7     88
     0     0   736   1539      0      0     0      0   187    229     32    208
     0     0   647   1222      0      0     0      0   210    206     18    159
     0     0   555   1061      0      0    64      2    61     90     10     83
     0     0   432    840      0      0     0      0    84    110     11    101
     0     0   511   1025      0      0     0      0     7     49      7     49

stap monitoring reference

stap (SystemTap) is a good system monitoring tool that is well suited to monitoring individual processes.

yum install -y  systemtap*

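The systemtap-client package includes a number of commonly used administration scripts. You can of course also write your own scripts to monitor and display system activity.

The stap command only works once the following packages are installed. Their version must match the running kernel (see uname -r).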

rpm -ivh kernel-debuginfo-3.10.0-327.el7.x86_64.rpm  kernel-debuginfo-common-x86_64-3.10.0-327.el7.x86_64.rpm
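Since the debuginfo packages have to match the running kernel, it can help to derive their names from the kernel release string instead of typing them by hand. A small sketch, with the helper name debuginfo_pkgs invented for this example and the naming pattern taken from the rpm command above:

```shell
# debuginfo_pkgs KERNEL_RELEASE
# Print the two kernel-debuginfo package names for a kernel release string
# such as the output of `uname -r`.
debuginfo_pkgs() {
    release="$1"
    arch="${release##*.}"    # text after the last dot, e.g. x86_64
    printf 'kernel-debuginfo-%s\n' "$release"
    printf 'kernel-debuginfo-common-%s-%s\n' "$arch" "$release"
}

# On the host from this article, `uname -r` would give 3.10.0-327.el7.x86_64:
debuginfo_pkgs 3.10.0-327.el7.x86_64
```

On a live system, debuginfo_pkgs "$(uname -r)" prints the names to look for in the debuginfo repository.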

The processes currently doing the heaviest I/O on the system

iostop.stp

#!/usr/bin/stap

global io_stat,device
global read_bytes,write_bytes

probe vfs.read.return {
  if ($return>0) {
    if (devname!="N/A") { /*skip read from cache*/
      io_stat[pid(),execname(),uid(),ppid(),"R"] += $return
      device[pid(),execname(),uid(),ppid(),"R"] = devname
      read_bytes += $return
    }
  }
}

probe vfs.write.return {
  if ($return>0) {
    if (devname!="N/A") { /*skip update cache*/
      io_stat[pid(),execname(),uid(),ppid(),"W"] += $return
      device[pid(),execname(),uid(),ppid(),"W"] = devname
      write_bytes += $return
    }
  }
}

probe timer.ms(5000) {
  /* skip non-read/write disk */
  if (read_bytes+write_bytes) {
    printf("\n%-25s, %-8s%4dKb/sec, %-7s%6dKb, %-7s%6dKb\n\n",
           ctime(gettimeofday_s()),
           "Average:", ((read_bytes+write_bytes)/1024)/5,
           "Read:",read_bytes/1024,
           "Write:",write_bytes/1024)

    /* print header */
    printf("%8s %8s %8s %25s %8s %4s %12s\n",
           "UID","PID","PPID","CMD","DEVICE","T","BYTES")
  }

  /* print top ten I/O */
  foreach ([process,cmd,userid,parent,action] in io_stat- limit 10)
    printf("%8d %8d %8d %25s %8s %4s %12d\n",
           userid,process,parent,cmd,
           device[process,cmd,userid,parent,action],
           action,io_stat[process,cmd,userid,parent,action])

  /* clear data */
  delete io_stat
  delete device
  read_bytes = 0
  write_bytes = 0
}

probe end{
  delete io_stat
  delete device
  delete read_bytes
  delete write_bytes
}

For reference

[root@gx-yun-084036 io]# cat iotop.stp

#!/usr/bin/stap

global reads, writes, total_io

probe vfs.read.return {
    reads[execname()] += bytes_read
}

probe vfs.write.return {
    writes[execname()] += bytes_written
}

# print top 10 IO processes every 5 seconds
probe timer.s(5) {
    foreach (name in writes)
        total_io[name] += writes[name]
    foreach (name in reads)
        total_io[name] += reads[name]
    printf ("%16s\t%10s\t%10s\n", "Process", "KB Read", "KB Written")
    foreach (name in total_io- limit 10)
        printf("%16s\t%10d\t%10d\n", name,
               reads[name]/1024, writes[name]/1024)
    delete reads
    delete writes
    delete total_io
    print("\n")
}

Output of stap iotop.stp:

         Process           KB Read      KB Written
              dd           2048036         2048000
          docker                41               1
   zabbix_agentd                39               0
        cadvisor                24               0
         dmsetup                15               0
    ovsdb-server                 0               7
            bash                 7               0
       netplugin                 5               0
    ovs-vswitchd                 2               0
            sshd                 1               1

Which files on the system were read or written during a given period

iotime.stp

#!/usr/bin/stap

global start
global time_io

function timestamp:long() { return gettimeofday_us() - start }

function proc:string() { return sprintf("%d (%s)", pid(), execname()) }

probe begin { start = gettimeofday_us() }

global filehandles, fileread, filewrite

probe syscall.open.return {
  filename = user_string($filename)
  if ($return != -1) {
    filehandles[pid(), $return] = filename
  } else {
    printf("%d %s access %s fail\n", timestamp(), proc(), filename)
  }
}

probe syscall.read.return {
  p = pid()
  fd = $fd
  bytes = $return
  time = gettimeofday_us() - @entry(gettimeofday_us())
  if (bytes > 0)
    fileread[p, fd] += bytes
  time_io[p, fd] <<< time
}

probe syscall.write.return {
  p = pid()
  fd = $fd
  bytes = $return
  time = gettimeofday_us() - @entry(gettimeofday_us())
  if (bytes > 0)
    filewrite[p, fd] += bytes
  time_io[p, fd] <<< time
}

probe syscall.close {
  if ([pid(), $fd] in filehandles) {
    printf("%d %s access %s read: %d write: %d\n",
           timestamp(), proc(), filehandles[pid(), $fd],
           fileread[pid(), $fd], filewrite[pid(), $fd])
    if (@count(time_io[pid(), $fd]))
      printf("%d %s iotime %s time: %d\n",  timestamp(), proc(),
             filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
  }
  delete fileread[pid(), $fd]
  delete filewrite[pid(), $fd]
  delete filehandles[pid(), $fd]
  delete time_io[pid(),$fd]
}

Output:

[root@gx-yun-084036 io]# stap iotime.stp -c "sleep 1"
66449 28145 (sleep) access /etc/ld.so.cache read: 0 write: 0
66515 28145 (sleep) access /lib64/libc.so.6 read: 832 write: 0
66519 28145 (sleep) iotime /lib64/libc.so.6 time: 2
66739 28145 (sleep) access /usr/lib/locale/locale-archive read: 0 write: 0
573033 2747 (zabbix_agentd) access /proc/stat read: 8191 write: 0
573046 2747 (zabbix_agentd) iotime /proc/stat time: 1317
1034282 28148 (dmsetup) access /etc/ld.so.cache read: 0 write: 0
1034350 28148 (dmsetup) access /lib64/libdevmapper.so.1.02 read: 832 write: 0
1034355 28148 (dmsetup) iotime /lib64/libdevmapper.so.1.02 time: 2
1034394 28148 (dmsetup) access /lib64/librt.so.1 read: 832 write: 0

Which process is causing most of the current disk I/O

disktop.stp

#!/usr/bin/stap

global io_stat,device
global read_bytes,write_bytes

probe vfs.read.return {
  if ($return>0) {
    if (devname!="N/A") { /*skip read from cache*/
      io_stat[pid(),execname(),uid(),ppid(),"R"] += $return
      device[pid(),execname(),uid(),ppid(),"R"] = devname
      read_bytes += $return
    }
  }
}

probe vfs.write.return {
  if ($return>0) {
    if (devname!="N/A") { /*skip update cache*/
      io_stat[pid(),execname(),uid(),ppid(),"W"] += $return
      device[pid(),execname(),uid(),ppid(),"W"] = devname
      write_bytes += $return
    }
  }
}

probe timer.ms(5000) {
  /* skip non-read/write disk */
  if (read_bytes+write_bytes) {
    printf("\n%-25s, %-8s%4dKb/sec, %-7s%6dKb, %-7s%6dKb\n\n",
           ctime(gettimeofday_s()),
           "Average:", ((read_bytes+write_bytes)/1024)/5,
           "Read:",read_bytes/1024,
           "Write:",write_bytes/1024)

    /* print header */
    printf("%8s %8s %8s %25s %8s %4s %12s\n",
           "UID","PID","PPID","CMD","DEVICE","T","BYTES")
  }

  /* print top ten I/O */
  foreach ([process,cmd,userid,parent,action] in io_stat- limit 10)
    printf("%8d %8d %8d %25s %8s %4s %12d\n",
           userid,process,parent,cmd,
           device[process,cmd,userid,parent,action],
           action,io_stat[process,cmd,userid,parent,action])

  /* clear data */
  delete io_stat
  delete device
  read_bytes = 0
  write_bytes = 0
}

probe end{
  delete io_stat
  delete device
  delete read_bytes
  delete write_bytes
}

Output:

[root@gx-yun-084036 io]# stap disktop.stp

Thu Sep 22 07:40:41 2016 , Average:415962Kb/sec, Read:      68Kb, Write: 2079746Kb

     UID      PID     PPID                       CMD   DEVICE    T        BYTES
       0    48463    44947                        dd     sda2    W    209715200
       0    48510    44947                        dd     sda2    W    209715200
       0    48511    44947                        dd     sda2    W    209715200
       0    48512    44947                        dd     sda2    W    209715200
       0    48513    44947                        dd     sda2    W    209715200
       0    48514    44947                        dd     sda2    W    209715200
       0    48515    44947                        dd     sda2    W    209715200
       0    48516    44947                        dd     sda2    W    209715200
       0    48517    44947                        dd     sda2    W    209715200
       0    48518    44947                        dd     sda2    W    209715200
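The "Average" figure in the summary line comes from the script's expression ((read_bytes+write_bytes)/1024)/5, that is, the total bytes in the 5-second window converted to KB and divided by the window length. The sketch below repeats that integer arithmetic in shell. The function name avg_kb_per_sec is made up, and the byte counts in the call are an assumption: they are reconstructed from the rounded KB values printed above (68 KB and 2079746 KB), not taken from the actual run.

```shell
# avg_kb_per_sec READ_BYTES WRITE_BYTES SECONDS
# Same integer arithmetic as the stap script: total bytes -> KB -> KB per second.
avg_kb_per_sec() {
    echo $(( ($1 + $2) / 1024 / $3 ))
}

# 68 KB read and 2079746 KB written (expressed in bytes) over a 5-second window
avg_kb_per_sec 69632 2129659904 5    # prints 415962
```

The result matches the Average:415962Kb/sec shown in the sample output, with the fractional part dropped by integer division.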

Reposted from: http://llnni.baihongyu.com/
