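Simple log analysis with sed and awk
Environment
This experiment was done on macOS; things may differ slightly on Linux (for example, gzcat is usually spelled zcat or gzip -dc there).
Log file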
gzcat data.gz | head -n 40   # data.gz is the log file
Output:
May 13 00:01:58 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12513]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:01:58 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12513]): Service exited with abnormal code: 78
May 13 00:02:12 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.xpc.launchd.domain.pid.mdmclient.12523): Failed to bootstrap path: path = /usr/libexec/mdmclient, error = 108: Invalid path
May 13 00:04:20 BBAOMACBOOKAIR2 syslogd[113]: ASL Sender Statistics
May 13 00:05:58 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12535]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:05:58 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12535]): Service exited with abnormal code: 78
May 13 00:09:58 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12536]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:09:58 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12536]): Service exited with abnormal code: 78
May 13 00:17:59 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12555]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:17:59 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12555]): Service exited with abnormal code: 78
May 13 00:17:59 BBAOMACBOOKAIR2 syslogd[113]: ASL Sender Statistics
May 13 00:19:59 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12556]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:19:59 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12556]): Service exited with abnormal code: 78
May 13 00:21:59 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12560]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:21:59 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12560]): Service exited with abnormal code: 78
May 13 00:22:18 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.xpc.launchd.domain.user.914945058): Service "com.apple.xpc.launchd.unmanaged.loginwindow.594" tried to register for endpoint "com.apple.tsm.uiserver" already registered by owner: com.apple.TextInputMenuAgent
May 13 00:22:49 --- last message repeated 1 time ---
May 13 00:23:50 BBAOMACBOOKAIR2 timed[158]: settimeofday({0x5ebacd96,0x52ddf}) == 0
May 13 00:28:05 BBAOMACBOOKAIR2 syslogd[113]: ASL Sender Statistics
May 13 00:28:07 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.ScreenSaver.Computer-Name[12564]): Service exited due to SIGKILL | sent by Computer Name[12564]
May 13 00:28:17 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing the target of a source after it has been activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:17 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing target queue hierarchy after xpc connection was activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:18 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing the target of a source after it has been activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:18 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing target queue hierarchy after xpc connection was activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:19 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing the target of a source after it has been activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:19 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing target queue hierarchy after xpc connection was activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:20 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing the target of a source after it has been activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:20 BBAOMACBOOKAIR2 VTDecoderXPCService[960]: DEPRECATED USE in libdispatch client: Changing target queue hierarchy after xpc connection was activated; set a breakpoint on _dispatch_bug_deprecated to debug
May 13 00:28:26 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.preference.displays.MirrorDisplays): Service only ran for 9 seconds. Pushing respawn out by 1 seconds.
May 13 00:28:31 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.preference.displays.MirrorDisplays): Service only ran for 4 seconds. Pushing respawn out by 6 seconds.
May 13 00:29:49 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12610]): Could not find uid associated with service: 0: Undefined error: 0 501
May 13 00:29:49 BBAOMACBOOKAIR2 com.apple.xpc.launchd[1] (com.apple.mdworker.bundles[12610]): Service exited with abnormal code: 78
May 13 00:30:00 BBAOMACBOOKAIR2 syslogd[113]: Configuration Notice:
ASL Module "com.apple.cdscheduler" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
May 13 00:30:00 BBAOMACBOOKAIR2 syslogd[113]: Configuration Notice:
ASL Module "com.apple.install" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
May 13 00:30:00 BBAOMACBOOKAIR2 syslogd[113]: Configuration Notice:
ASL Module "com.apple.callhistory.asl.conf" claims selected messages.
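Pasted like this, the log may be hard to read, but it boils down to a few points:
- Normal entries start with a date and time, such as "May 13 00:30:00".
- Lines that do not start this way (almost all of them begin with a tab, \t) are continuations of the previous line and must be merged with it during analysis.
- An entry containing "--- last message repeated 1 time ---" has exactly the same content as the previous entry, apart from the leading date and time.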
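Before writing any merging logic, it can help to check that the three observations above really cover the whole file. The following is only a sketch: classify_lines is a hypothetical helper name, and the timestamp pattern assumes the "Mon DD HH:MM:SS" layout seen in the sample.

```shell
# Sketch: count how many lines fall into each of the three categories.
# classify_lines is a hypothetical name, not part of the original solution.
classify_lines() {
  awk '
  # Continuation of the previous entry: starts with a tab.
  /^\t/ { kinds["continuation"]++; next }
  # Repeat marker; checked before the timestamp rule because it also
  # starts with a date and time.
  /last message repeated [0-9]+ time/ { kinds["repeat"]++; next }
  # Regular entry: "Mon DD HH:MM:SS " at the start of the line.
  /^[A-Z][a-z][a-z] +[0-9]+ [0-9][0-9]:[0-9][0-9]:[0-9][0-9] / { kinds["entry"]++; next }
  # Anything else is unexpected and worth a closer look.
  { kinds["other"]++ }
  END {
    printf "entry=%d continuation=%d repeat=%d other=%d\n",
           kinds["entry"], kinds["continuation"], kinds["repeat"], kinds["other"]
  }'
}
```

Running gzcat data.gz | classify_lines and getting a non-zero "other" count would suggest that the file contains a line shape not covered by the list above.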
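Requirements
Analyze the system log file (data.gz), extract the key information from it, and upload it as JSON via POST to the server (https://foo.com/bar). The key names are given in parentheses:
- Device name (deviceName)
- Process ID of the failing process (processId)
- Process/service name (processName)
- Cause of the error, i.e. its description (description)
- Time of occurrence at hourly granularity, e.g. 0100-0200, 0300-0400 (timeWindow)
- Number of occurrences within that hour (numberOfOccurrence)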
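To make the six keys concrete, here is one possible way to derive them from log lines that have already been merged into one entry per line. This is a sketch under stated assumptions, not the log_ana.awk used later: summarize_log is a hypothetical name, the process name and ID are taken from the fifth field (for example syslogd[113]), everything after that field is treated as the description, and the window for hour 23 is written as 2300-2400. Output is tab-separated in the key order listed above; turning it into JSON is left out.

```shell
# Sketch: derive deviceName, processId, processName, description, timeWindow
# and numberOfOccurrence from merged log lines (one entry per line).
# summarize_log is a hypothetical name; field positions are assumptions.
summarize_log() {
  awk '
  BEGIN { OFS = "\t" }
  NF < 5 { next }                       # skip lines that are not full entries
  {
    # timeWindow: HH:MM:SS -> "HH00-(HH+1)00"
    split($3, t, ":")
    hour = t[1] + 0
    window = sprintf("%02d00-%02d00", hour, hour + 1)

    # processName and processId from a field such as "syslogd[113]:"
    proc = $5
    sub(/:$/, "", proc)
    pid = ""
    if (match(proc, /\[[0-9]+\]$/)) {
      pid  = substr(proc, RSTART + 1, RLENGTH - 2)
      proc = substr(proc, 1, RSTART - 1)
    }

    # description: the line with its first five fields removed
    desc = $0
    for (i = 1; i <= 5; i++) sub(/^[^ ]+ +/, "", desc)

    count[$4 OFS pid OFS proc OFS desc OFS window]++
  }
  END { for (key in count) print key, count[key] }
  ' | sort
}
```

The sort at the end only makes the output order deterministic, since awk does not guarantee any order when iterating over an array.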
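Analysis
The point being tested here is presumably the handling of multi-line log entries (mainly merging them), which sed can take care of. There is also the issue of repeated entries, i.e. what to do when a line contains "--- last message repeated 1 time ---"; I have not figured out a perfect solution for that.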
Final solution
gzcat data.gz | head -n 40 | sed -e '1h;2,$H;$!d;g;s/\n\t/ /g' | awk -f log_ana.awk > data.json
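The sed expression in this pipeline is a common "slurp the whole input" idiom, and it may be easier to follow when broken down step by step. The sketch below wraps the same commands in a hypothetical helper named merge_continuations. One deliberate difference, made as an assumption for portability: it passes a literal tab character instead of \t, because not every sed interprets \t inside a pattern.

```shell
# Sketch: the same sed idiom, annotated. merge_continuations is a
# hypothetical helper name.
merge_continuations() {
  tab=$(printf '\t')   # literal tab, so the pattern does not depend on \t support
  # 1h     first line: copy it into the hold space
  # 2,$H   every later line: append it to the hold space, after a newline
  # $!d    any line but the last: discard the pattern space, read the next line
  # g      last line: copy the accumulated hold space back into the pattern space
  # s///g  replace each newline-plus-tab with a single space
  sed -e '1h;2,$H;$!d;g' -e "s/\n${tab}/ /g"
}
```

Because the whole input is collected in the hold space before the substitution runs, memory use grows with the size of the log.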
The code of log_ana.awk is as follows:
#! /usr/bin/awk -f
The contents of data.json can be viewed like this:
cat data.json
[
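The requirements also call for POSTing the result to the server, a step not shown above. A minimal sketch with curl might look like the following. The function name upload_report is hypothetical, and it assumes the endpoint accepts a plain application/json body without authentication.

```shell
# Sketch: upload a JSON file with an HTTP POST.
# upload_report is a hypothetical helper; $1 is the file, $2 is the URL.
upload_report() {
  # --fail makes curl return a non-zero exit status on HTTP errors;
  # --data-binary sends the file contents unmodified.
  curl --silent --show-error --fail \
       --request POST \
       --header 'Content-Type: application/json' \
       --data-binary "@$1" \
       "$2"
}
```

Usage would be along the lines of upload_report data.json https://foo.com/bar.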
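A rough explanation:
Merging multi-line entries is handled by the sed command. The awk program mainly deals with the "--- last message repeated 1 time ---" lines and performs the analysis itself (extracting the key information from the log, aggregating it, and generating the report file). This is not ideal, since the log data is traversed twice; in theory a single pass should be enough. However, I could not work out how to handle both the multi-line merging and the "--- last message repeated 1 time ---" lines inside awk at the same time. Any advice would be welcome.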
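One possible single-pass approach is to buffer only the current entry in awk: continuation lines are appended to the buffer, and the buffer is printed when the next timestamped line arrives. A repeat marker can then be expanded from the most recently printed entry. The sketch below is an illustration under assumptions, not a drop-in replacement for log_ana.awk: normalize_log is a hypothetical name, and "repeated N times" is taken to mean N additional copies of the previous entry, each stamped with the marker's own date and time.

```shell
# Sketch: merge continuation lines and expand repeat markers in one pass.
# normalize_log is a hypothetical helper name.
normalize_log() {
  awk '
  # Print the buffered entry, if any, and remember it for repeat markers.
  function emit() {
    if (have) { print record; last = record }
    have = 0
  }

  # Continuation line: strip the leading tab(s) and glue it to the buffer.
  /^\t/ {
    if (have) { sub(/^\t+/, ""); record = record " " $0 }
    next
  }

  # Repeat marker: finish the buffered entry, then re-emit it N times
  # with the timestamp of the marker line.
  /last message repeated [0-9]+ time/ {
    emit()
    if (last != "") {
      n = $0
      sub(/^.*repeated /, "", n)
      n = n + 0                                  # leading number of "N times"
      body = last
      sub(/^[^ ]+ +[^ ]+ +[^ ]+ +/, "", body)    # drop the old date and time
      for (i = 0; i < n; i++) print $1, $2, $3, body
    }
    next
  }

  # Regular entry: flush the previous one and start a new buffer.
  { emit(); record = $0; have = 1 }

  END { emit() }
  '
}
```

With this in place the pipeline could become gzcat data.gz | normalize_log | awk -f log_ana.awk > data.json, or the rules could be folded into the analysis script itself so that only one awk process runs. Only one entry is held in memory at a time, unlike the sed approach, which collects the whole file first.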