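We noticed that some IPs were scraping large amounts of data from our API, so I wrote this script to find IPs that access a single endpoint and never touch the others. That access pattern is usually abnormal.
The script analyzes the logs of the front-end nginx load balancer. The log format is as follows: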
- 114.249.4.96 - - [15/Jan/2016:23:59:47 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
- 222.128.172.215 - - [15/Jan/2016:23:59:47 +0800] "POST /api2/button_log/ HTTP/1.1" 200 48 "-" "-" "-"
- 110.72.182.177 - - [15/Jan/2016:23:59:47 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
- 58.63.7.92 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/getgoodsdetail/ HTTP/1.1" 200 877 "-" "-" "-"
- 117.177.160.218 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
- 117.177.160.218 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
- 163.142.55.76 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/getuserinfo/ HTTP/1.1" 200 546 "-" "-" "-"
- 114.112.89.34 - - [15/Jan/2016:23:59:48 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 9532 "-" "-" "-"
- 58.61.225.110 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
- 114.244.195.163 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 47834 "-" "-" "-"
- 114.244.195.163 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 47834 "-" "-" "-"
- 114.112.89.34 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 9532 "-" "-" "-"
- 125.39.170.239 - - [15/Jan/2016:23:59:49 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 30 "-" "-" "-"
- 110.84.169.57 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
- 42.81.46.142 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
- 110.84.169.57 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
- 117.136.40.148 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/getgoodslist/ HTTP/1.1" 200 1024 "-" "-" "-"
- 117.12.243.251 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"
- 117.12.243.251 - - [15/Jan/2016:23:59:50 +0800] "POST /api2/realtimetrack/ HTTP/1.1" 200 82 "-" "-" "-"
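As a quick check of the format above, here is a minimal sketch showing where the client IP and the request URL land when a line is split on whitespace. `parse_line` is a hypothetical helper introduced only for illustration, and the length check is an assumption about how to handle malformed lines.

```python
# Sample line copied from the log excerpt above.
sample = ('114.249.4.96 - - [15/Jan/2016:23:59:47 +0800] '
          '"POST /api2/realtimetrack/ HTTP/1.1" 200 48 "-" "-" "-"')


def parse_line(line):
    """Return (ip, url) for one access-log line, or None if it is too short.

    Hypothetical helper for illustration only.
    """
    fields = line.split()
    # Field 0 is the client IP and field 6 is the request path; the timestamp
    # occupies two fields because of the space before the timezone offset.
    if len(fields) < 7:
        return None
    return fields[0], fields[6]


print(parse_line(sample))  # ('114.249.4.96', '/api2/realtimetrack/')
```

Plain whitespace splitting is enough here because the IP and the path come before any quoted field that could contain spaces, such as a user agent.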
Python analysis code:
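- #!/usr/bin/env python
- #coding:utf8
- __author__ = '戴儒锋'
- """
- Check the nginx access log for IPs that look like programs scraping the API.
- Rule: find IPs that access the getgoodslist endpoint but no other endpoints.
- """
- log_path = '/home/logs/nginx/access.log'
- # Per-IP hit count for each URL, e.g. {'10.0.0.1': {'/api2/getgoodslist/': 15}}
- ip_info = {}
- with open(log_path, 'r') as f:
-     # Iterate over the file object so the whole log is not loaded into memory
-     for line in f:
-         fields = line.split()
-         # Skip blank or malformed lines that lack the request URL field
-         if len(fields) < 7:
-             continue
-         # Client IP address
-         ip = fields[0]
-         # Requested endpoint URL
-         url = fields[6]
-         # New IP: create its inner dict with this URL counted once
-         # Known IP, new URL: add the URL to the inner dict with a count of 1
-         # Known IP, known URL: increment that URL's count
-         if ip not in ip_info:
-             ip_info[ip] = {url: 1}
-         else:
-             if url not in ip_info[ip]:
-                 ip_info[ip][url] = 1
-             else:
-                 ip_info[ip][url] += 1
- # Print the IPs that accessed fewer than 3 distinct endpoints and hit getgoodslist more than 100 times
- for ip, value in ip_info.items():
-     if len(value) < 3 and value.get('/api2/getgoodslist/', 0) > 100:
-         print("IP:%s URL-COUNT:%s" % (ip, value))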
Analysis results:
- IP:58.63.7.92 URL-COUNT:{'/api2/getgoodsdetail/': 3383, '/api2/getgoodslist/': 550}
- IP:58.63.4.71 URL-COUNT:{'/api2/getgoodsdetail/': 4499, '/api2/getgoodslist/': 275}
- IP:118.122.120.146 URL-COUNT:{'/api2/getgoodslist/': 443}
- IP:114.244.195.163 URL-COUNT:{'/api2/getgoodslist/': 568}
- IP:124.72.23.174 URL-COUNT:{'/api2/getgoodslist/': 132}
- IP:183.30.79.59 URL-COUNT:{'/api2/getgoodslist/': 322, '/api2/realtimetrack/': 6}
- IP:61.140.50.120 URL-COUNT:{'/api2/getgoodslist/': 1402}
- IP:171.221.25.108 URL-COUNT:{'/api2/getgoodslist/': 1136}
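The counting rule used by the script (fewer than 3 distinct endpoints, more than 100 hits on getgoodslist) could also be wrapped in a reusable function. The following is only a sketch, not the script that produced the results above: `find_suspects` and its parameters `target`, `max_urls`, and `min_hits` are hypothetical names, and the demo lines are synthetic data, not real traffic.

```python
from collections import Counter, defaultdict


def find_suspects(lines, target='/api2/getgoodslist/', max_urls=3, min_hits=100):
    """Return {ip: Counter of URL hits} for IPs matching the scraping rule.

    An IP matches when it accessed fewer than max_urls distinct URLs and
    requested target more than min_hits times. All names are illustrative.
    """
    ip_info = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) < 7:  # skip blank or malformed lines
            continue
        ip_info[fields[0]][fields[6]] += 1
    return {ip: urls for ip, urls in ip_info.items()
            if len(urls) < max_urls and urls[target] > min_hits}


# Synthetic demo data in the same layout as the nginx log shown earlier.
line_fmt = '%s - - [15/Jan/2016:23:59:47 +0800] "POST %s HTTP/1.1" 200 48 "-" "-" "-"'
demo = ([line_fmt % ('10.0.0.1', '/api2/getgoodslist/')] * 150
        + [line_fmt % ('10.0.0.2', '/api2/getgoodslist/')] * 5)

print(sorted(find_suspects(demo)))  # ['10.0.0.1']
```

Because the function accepts any iterable of lines, an open log file can be passed to it directly, and the thresholds can be tuned per endpoint without editing the loop.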