rgw lifecycle
1. Cluster environment
Three machines running a nautilus deployment (for the online deployment procedure see the companion article on deploying a nautilus Ceph cluster with ceph-ansible):
- ceph-2: 192.168.2.76 (mons, mgrs, osds), (osd0, osd3)
- ceph-3: 192.168.2.27 (mons, osds, rgws), (osd2, osd5)
- ceph-4: 192.168.2.40 (mons, osds, rgws), (osd1, osd4)
2. Modify the crushmap
The initial crushmap looks like this:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.29398 root default
-3 0.09799 host ceph-2
0 hdd 0.04900 osd.0 up 1.00000 1.00000
3 hdd 0.04900 osd.3 up 1.00000 1.00000
-7 0.09799 host ceph-3
2 hdd 0.04900 osd.2 up 1.00000 1.00000
5 hdd 0.04900 osd.5 up 1.00000 1.00000
-5 0.09799 host ceph-4
1 hdd 0.04900 osd.1 up 1.00000 1.00000
4 hdd 0.04900 osd.4 up 1.00000 1.00000
To run lifecycle within a single zone, the OSDs on the three hosts need to be split into two groups: on each host, one OSD goes to each group, and each group backs a different set of pools. After the split the crushmap looks like this:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-17 0.44696 root ch2_disk
-5 0.14899 host ceph-2-3
3 hdd 0.04900 osd.3 up 1.00000 1.00000
-13 0.14899 host ceph-3-5
5 hdd 0.04900 osd.5 up 1.00000 1.00000
-9 0.14899 host ceph-4-4
4 hdd 0.04900 osd.4 up 1.00000 1.00000
-15 0.44696 root ch1_disk
-3 0.14899 host ceph-2-0
0 hdd 0.04900 osd.0 up 1.00000 1.00000
-11 0.14899 host ceph-3-2
2 hdd 0.04900 osd.2 up 1.00000 1.00000
-7 0.14899 host ceph-4-1
1 hdd 0.04900 osd.1 up 1.00000 1.00000
The concrete steps are as follows (I have not yet done this purely from the command line; a hedged CLI sketch is given right after these steps):
- Get the crush map
ceph osd getcrushmap -o crushmap
- Decompile the crush map with crushtool
crushtool -d crushmap -o dencode_crushmap
- Edit the dencode_crushmap file:
# buckets
#--------------------------------------------------------------#
# Split each OSD out into its own host bucket, so the rules below can still choose by host
#--------------------------------------------------------------#
host ceph-2-0 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item osd.0 weight 0.049
}
host ceph-2-3 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item osd.3 weight 0.049
}
host ceph-4-1 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item osd.1 weight 0.049
}
host ceph-4-4 {
id -9 # do not change unnecessarily
id -10 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item osd.4 weight 0.049
}
host ceph-3-2 {
id -11 # do not change unnecessarily
id -12 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item osd.2 weight 0.049
}
host ceph-3-5 {
id -13 # do not change unnecessarily
id -14 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item osd.5 weight 0.049
}
# root
#------------------------------------------------------------#
# Group the per-OSD hosts into two roots as planned, and create a rule for each root
#------------------------------------------------------------#
root ch1_disk {
id -15 # do not change unnecessarily
id -16 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item ceph-2-0 weight 0.149
item ceph-3-2 weight 0.149
item ceph-4-1 weight 0.149
}
root ch2_disk {
id -17 # do not change unnecessarily
id -18 class hdd # do not change unnecessarily
alg straw2
hash 0 # rjenkins1
item ceph-2-3 weight 0.149
item ceph-3-5 weight 0.149
item ceph-4-4 weight 0.149
}
# rules
#------------------------------------------------------------#
# Create one rule for each of the two roots
#------------------------------------------------------------#
rule ch1_disk {
id 0
type replicated
min_size 1
max_size 10
step take ch1_disk
step chooseleaf firstn 0 type host
step emit
}
rule ch2_disk {
id 1
type replicated
min_size 1
max_size 10
step take ch2_disk
step chooseleaf firstn 0 type host
step emit
}
#end crush map
- Compile the new crush map
crushtool -c dencode_crushmap -o newcrushmap
- Inject the new crush map into the running cluster
ceph osd setcrushmap -i newcrushmap
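For reference, the same layout can probably be built directly from the CLI without decompiling the map. This is only a sketch I have not run against this cluster; double-check bucket names and weights before using it:
# create the two roots and the per-OSD host buckets
ceph osd crush add-bucket ch1_disk root
ceph osd crush add-bucket ch2_disk root
ceph osd crush add-bucket ceph-2-0 host
ceph osd crush move ceph-2-0 root=ch1_disk
# relocate the OSD into its new host bucket (weight taken from the original map)
ceph osd crush set osd.0 0.049 root=ch1_disk host=ceph-2-0
# ...repeat add-bucket/move/set for ceph-3-2, ceph-4-1 (ch1_disk) and ceph-2-3, ceph-3-5, ceph-4-4 (ch2_disk)...
# create one replicated rule per root, choosing leaves by host on the hdd class
ceph osd crush rule create-replicated ch1_disk ch1_disk host hdd
ceph osd crush rule create-replicated ch2_disk ch2_disk host hdd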
3. Create pools
Delete the unused pools and create new pools on top of the ch1_disk/ch2_disk rules.
3.1 Delete the default pools, keeping only .rgw.root
Run ./clean_pool.sh default
#clean_pool.sh
#!/bin/bash
pool_prefix=$1
all_pool=$(ceph osd pool ls | grep "${pool_prefix}")
echo "all_pool:${all_pool}"
for cur_pool in ${all_pool}
do
    ceph osd pool delete ${cur_pool} ${cur_pool} --yes-i-really-really-mean-it
done
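Note that pool deletion is refused unless the monitors allow it. If the delete commands in the script fail, enabling the option temporarily should help (a hedged aside, not part of the original procedure):
# allow pool deletion via the centralized config store (Nautilus)
ceph config set mon mon_allow_pool_delete true
./clean_pool.sh default
# turn it back off afterwards
ceph config set mon mon_allow_pool_delete false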
3.2 Create the two new groups of pools
Create two groups of pools, class_hdd_pool_1.xxx and class_hdd_pool_2.xxx. class_hdd_pool_1.xxx backs the STANDARD storage class and class_hdd_pool_2.xxx backs the CLOD storage class. In the lifecycle service, a transition moves data from the data pool of storage class A to the data pool of storage class B, while the other metadata/index pools of storage class A remain shared.
- class_hdd_pool_1.xxx
- class_hdd_pool_1.data
- class_hdd_pool_1.index
- class_hdd_pool_1.control
- class_hdd_pool_1.meta
- class_hdd_pool_1.log
- class_hdd_pool_1.non-ec
- class_hdd_pool_2.xxx
- class_hdd_pool_2.data
- class_hdd_pool_2.index
- class_hdd_pool_2.control
- class_hdd_pool_2.meta
- class_hdd_pool_2.log
- class_hdd_pool_2.non-ec
#build_2class_pool.sh
#!/bin/bash
classpool_prefix="class_hdd_pool_"
id=1
end_id=3
while true
do
    if [ ${id} -eq ${end_id} ];then
        break
    fi
    pool_name=${classpool_prefix}${id}
    # use the matching crush rule
    rule_name=ch${id}_disk
    echo "pool_name:${pool_name} rule_name:${rule_name}"
    # create the pools; adjust pg/pgp counts to your cluster
    ceph osd pool create ${pool_name}.data 64 64 replicated ${rule_name}
    ceph osd pool create ${pool_name}.index 16 16 replicated ${rule_name}
    ceph osd pool create ${pool_name}.control 8 8 replicated ${rule_name}
    ceph osd pool create ${pool_name}.meta 8 8 replicated ${rule_name}
    ceph osd pool create ${pool_name}.log 8 8 replicated ${rule_name}
    ceph osd pool create ${pool_name}.non-ec 8 8 replicated ${rule_name}
    let id=${id}+1
done
- Run ./build_2class_pool.sh to create the pools
- List the pools
[root@ceph-3 storage-class]# ceph osd pool ls
.rgw.root
class_hdd_pool_1.data
class_hdd_pool_1.index
class_hdd_pool_1.control
class_hdd_pool_1.meta
class_hdd_pool_1.log
class_hdd_pool_1.non-ec
class_hdd_pool_2.data
class_hdd_pool_2.index
class_hdd_pool_2.control
class_hdd_pool_2.meta
class_hdd_pool_2.log
class_hdd_pool_2.non-ec
[root@ceph-3 storage-class]# ceph -s
cluster:
id: 7a03beb1-268b-4f5d-af43-0feabc4c7022
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-4,ceph-2,ceph-3 (age 6h)
mgr: ceph-3(active, since 17h)
osd: 6 osds: 6 up (since 2h), 6 in (since 17h)
rgw: 2 daemons active (ceph-2.rgw0, ceph-4.rgw0)
task status:
data:
pools: 13 pools, 256 pgs
objects: 12 objects, 2.1 KiB
usage: 6.3 GiB used, 294 GiB / 300 GiB avail
pgs: 256 active+clean
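As a quick sanity check that each pool actually landed on the intended CRUSH rule (a small aside, not part of the original procedure):
# should report ch1_disk and ch2_disk respectively
ceph osd pool get class_hdd_pool_1.data crush_rule
ceph osd pool get class_hdd_pool_2.data crush_rule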
4. Create the realm, zonegroup, and zone
4.1 Create the realm, zonegroup, zone, and zone1-placement
#delete the default zonegroup/zone
radosgw-admin zonegroup delete --rgw-zonegroup=default
radosgw-admin zone delete --rgw-zone=default
#create realm,zg,zone
radosgw-admin realm create --rgw-realm=petrel --default
radosgw-admin zonegroup create --rgw-zonegroup=petreloss --rgw-realm=petrel --master --default
radosgw-admin zone create --rgw-zone=zone1 --rgw-zonegroup=petreloss --master --default
radosgw-admin zone set --rgw-zone=zone1 --master --default --infile zone1.json
radosgw-admin period update --commit
Because this experiment runs lifecycle between two pools inside the same zone, zone1's standard storage class STANDARD is pinned to class_hdd_pool_1 and zone1's cold storage class CLOD is pinned to class_hdd_pool_2.
The contents of zone1.json are as follows:
{
"name": "zone1",
"domain_root": "class_hdd_pool_1.meta:root",
"control_pool": "class_hdd_pool_1.control",
"gc_pool": "class_hdd_pool_1.log:gc",
"lc_pool": "class_hdd_pool_1.log:lc",
"log_pool": "class_hdd_pool_1.log",
"intent_log_pool": "class_hdd_pool_1.log:intent",
"usage_log_pool": "class_hdd_pool_1.log:usage",
"reshard_pool": "class_hdd_pool_1.log:reshard",
"user_keys_pool": "class_hdd_pool_1.meta:users.keys",
"user_email_pool": "class_hdd_pool_1.meta:users.email",
"user_swift_pool": "class_hdd_pool_1.meta:users.swift",
"user_uid_pool": "class_hdd_pool_1.meta:users.uid",
"system_key": {
"access_key": "",
"secret_key": ""
},
"placement_pools": [
{
"key": "zone1-placement",
"val": {
"index_pool": "class_hdd_pool_1.index",
"data_extra_pool": "class_hdd_pool_1.non-ec",
"storage-classes": {
"STANDARD": {
"data_pool": "class_hdd_pool_1.data"
},
"CLOD": {
"data_pool": "class_hdd_pool_2.data"
}
},
"index_type": 0,
"compression": "",
}
}
],
"metadata_heap": "",
"tier_config": [],
"realm_id": ""
}
Problem: although zone1.json specifies storage-classes.STANDARD.data_pool: class_hdd_pool_1.data, radosgw-admin zone get --rgw-zone=zone1 does not show the specified data_pool. The cause is unclear; most likely the key should be storage_classes (with an underscore), which is how it appears in the zone get output below, so the hyphenated key was silently ignored.
[root@ceph-3 storage-class]# radosgw-admin zone get
{
"id": "6d8016ed-1e4d-4a65-9106-243f3a318d06",
"name": "zone1",
.....
"placement_pools": [
{
"key": "zone1-placement",
"val": {
"index_pool": "class_hdd_pool_1.index",
"storage_classes": {
"STANDARD": { // 这里并不显示data_pool
"compression_type": ""
}
},
"data_extra_pool": "class_hdd_pool_1.non-ec",
"index_type": 0
}
}
],
....
}
Since radosgw-admin zone get --rgw-zone=zone1 does not show the data_pool, set it explicitly:
[root@ceph-3 storage-class]# radosgw-admin zonegroup placement add --rgw-zonegroup=petreloss --placement-id=zone1-placement
[root@ceph-3 storage-class]# radosgw-admin zone placement modify --rgw-zone=zone1 --placement-id=zone1-placement \
--data-pool=class_hdd_pool_1.data
Question: why must a placement target be added to the zonegroup before it can be added to the zone?
- What is the relationship between a zone placement and a zonegroup placement?
- The zonegroup placement target is the whole set and the zone placement is a subset of it: the zonegroup declares which placement targets and storage classes exist, and each zone maps them onto concrete pools (a comparison of the two views is sketched below).
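A quick way to compare the two views, assuming your radosgw-admin build has the placement list subcommands:
# placement targets and storage classes declared at the zonegroup level
radosgw-admin zonegroup placement list --rgw-zonegroup=petreloss
# the concrete pools this zone maps them to
radosgw-admin zone placement list --rgw-zone=zone1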
Check the result:
[root@ceph-3 storage-class]# radosgw-admin zone get --rgw-zone=zone1
{
"id": "6d8016ed-1e4d-4a65-9106-243f3a318d06",
"name": "zone1",
....
"placement_pools": [
{
"key": "zone1-placement",
"val": {
"index_pool": "class_hdd_pool_1.index",
"storage_classes": {
"STANDARD": {
"data_pool": "class_hdd_pool_1.data", //已经存在
"compression_type": ""
}
},
"data_extra_pool": "class_hdd_pool_1.non-ec",
"index_type": 0
}
}
],
....
}
At this point the placement target backed by class_hdd_pool_1.xxx is fully created.
4.2 Add CLOD to zone1-placement
#get zg
radosgw-admin zonegroup get --rgw-zonegroup=petreloss > zonegroup.json
vim zonegroup.json
Add "CLOD" to storage_classes:
{
...
"placement_targets": [
{
"name": "zone1-placement",
"tags": [],
"storage_classes": [
"CLOD",
"STANDARD"
]
}
],
...
}
#set
radosgw-admin zonegroup set --rgw-zonegroup=petreloss --infile=zonegroup.json
#get zone
radosgw-admin zone get --rgw-zone=zone1 > zone.json
vim zone.json
Add the "CLOD" entry to storage_classes:
{
......
"placement_pools": [
{
"key": "zone1-placement",
"val": {
"index_pool": "class_hdd_pool_1.index",
"storage_classes": {
"CLOD": {
"data_pool": "class_hdd_pool_2.data"
},
"STANDARD": {
"data_pool": "class_hdd_pool_1.data",
"compression_type": ""
}
},
"data_extra_pool": "class_hdd_pool_1.non-ec",
"index_type": 0
}
}
],
......
}
#set
radosgw-admin zone set --rgw-zone=zone1 --infile=zone.json
Verify:
"placement_pools": [
{
"key": "zone1-placement",
"val": {
"index_pool": "class_hdd_pool_1.index",
"storage_classes": {
"CLOD": {
"data_pool": "class_hdd_pool_2.data"
},
"STANDARD": {
"data_pool": "class_hdd_pool_1.data",
"compression_type": ""
}
},
"data_extra_pool": "class_hdd_pool_1.non-ec",
"index_type": 0
}
}
],
Step 4 is now complete.
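One hedged note: if the running rgw daemons do not pick up the new storage class right away, committing the period again and restarting the gateways usually refreshes them (the service unit name depends on how rgw was deployed):
radosgw-admin period update --commit
systemctl restart ceph-radosgw.target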
5. Create a user and bucket
5.1 Create a user
radosgw-admin user create --uid="user123" --display-name="user123"
- Problem: if --placement-id is not specified when creating the user, the user uses the zonegroup's default placement as its placement target. If the zonegroup's default-placement has been deleted, creating the user still succeeds, but creating a bucket with that user then fails with ERROR: S3 error: 400 (InvalidLocationConstraint); the rgw log shows that default-placement cannot be found in the zonegroup.
- Fix: specify --placement-id when creating the user, e.g. --placement-id=zone1-placement.
That fix is the official advice, but in my test it did not seem to work:
[root@ceph-3 s3cmd]# radosgw-admin user modify --uid=user123 --placement-id=zone1-placement
{
"user_id": "user123",
"display_name": "user123",
...
"default_placement": "", //并未填充进去
...
}
Workaround:
radosgw-admin metadata get user:user123 > user.md.json
vim user.md.json // fill in the default_placement field, then save and exit
radosgw-admin metadata put user:user123 < user.md.json
Query again and default_placement is now set (see the check below).
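The result can be confirmed directly, a small check that is not in the original notes:
radosgw-admin user info --uid=user123 | grep default_placement
# expected: "default_placement": "zone1-placement",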
- Problem: if you created a bucket (say bucket-01) with a user whose default placement was default-placement, and then for whatever reason deleted default-placement and re-pointed the user at a new default placement such as zone1-placement above, bucket-01 can no longer be written to or deleted: rgw keeps looking up the bucket's original placement, fails to find it, and returns ERROR: S3 error: 400 (InvalidArgument).
The rgw log shows:
NOTICE: invalid dest placement: default-placement
init_permissions on sensebucket[83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.1] failed, ret=-22
5.2 Create a bucket
Create the bucket with s3cmd:
s3cmd mb s3://sensebucket
6. Configure the lifecycle policy
Use the boto3 SDK for Python to manage the lifecycle:
- Install boto3
pip install boto3
- Edit the rgw configuration file
vim /etc/ceph/ceph.conf
Add in the rgw section:
rgw_lifecycle_work_time = "00:00-24:00"
rgw_lc_debug_interval = -10
Lifecycle-related parameters:
- rgw_lifecycle_work_time = "00:00-06:00" // time window in which lc is allowed to run
- rgw_enable_lc_threads = true // enable lc threads; false disables the lc feature
- rgw_lc_lock_max_time = 60 // maximum time a single lc run may hold the lock; if not finished it continues on the next run
- rgw_lc_max_objs = 32 // number of lc rados (shard) objects
- rgw_lc_max_rules = 1000 // maximum number of rules per bucket
- rgw_lc_debug_interval = -10 // debug knob for lc. This one matters: when > 0 the work-time window is ignored and lc runs immediately, and expiration days are scaled so that 1 day counts as 1 second (i.e. a 7-day rule expires after 7 seconds); when <= 0 lc behaves normally.
- Restart rgw
- Write the boto3 script rgw_lifecycle_setup.py that sets the bucket lifecycle:
#!/usr/bin/env python2.7
#-*- coding: utf-8 -*-
import boto3
from botocore.client import Config
import datetime

if __name__ == "__main__":
    endpoint = "http://192.168.2.27:80"  # replace with your own endpoint
    access_key = "HWY7QPBEBBEJB238T465"  # replace
    secret_key = "VMy2BV0SHSnNZtuDrwPRrtK3DffCZ9IOBUPdVh42"  # replace
    bucket = "sensebucket"  # replace
    s3 = boto3.client('s3',
                      endpoint_url=endpoint,
                      aws_access_key_id=access_key,
                      aws_secret_access_key=secret_key,
                      )
    s3.put_bucket_lifecycle(
        Bucket=bucket,
        LifecycleConfiguration={
            'Rules': [
                {
                    'Status': 'Enabled',
                    'Prefix': '',
                    'Expiration':
                    {
                        'Days': 1
                    },
                    'ID': 'Unique_identifier_for_the_rule'  # any unique string usable as an ID
                }
            ],
        }
    )
    print s3.get_bucket_lifecycle(Bucket=bucket)
- Run ./rgw_lifecycle_setup.py
{u'Rules': [{u'Status': 'Enabled', u'Prefix': '', u'Expiration': {u'Days': 1}, u'ID': 'Unique_identifier_for_the_rule'}], 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': '', 'RequestId': 'tx000000000000000000014-005f645282-1798e-zone1', 'HTTPHeaders': {'date': 'Fri, 18 Sep 2020 06:24:02 GMT', 'content-length': '267', 'x-amz-request-id': 'tx000000000000000000014-005f645282-1798e-zone1', 'content-type': 'application/xml', 'connection': 'Keep-Alive'}}}
- Check with radosgw-admin lc list:
[root@ceph-3 s3cmd]# radosgw-admin lc list
[
{
"bucket": ":sensebucket:83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.2",
"status": "COMPLETE"
}
]
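To confirm the restarted rgw actually picked up the lc settings, the admin socket can be queried the same way as for the gc parameters later on (the asok path and name depend on your deployment; the one below is a placeholder):
ceph --admin-daemon /var/run/ceph/ceph-client.rgw.ceph-2.rgw0.asok config show | grep rgw_lc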
Wait a day and then check whether the files in this bucket get deleted. (Transition between storage classes is covered in sections 8 and 9 below.)
- Record the files currently in the bucket:
[root@ceph-3 s3cmd]# s3cmd --recursive ls s3://sensebucket/
2020-09-18 06:08 5242880 s3://sensebucket/lifecycle_5MB.gz
2020-09-18 05:48 1233 s3://sensebucket/user.md.json
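Rather than waiting for the work-time window, lifecycle processing can also be kicked off by hand; a hedged aside, behaviour may vary slightly between versions:
# force a lifecycle pass now, then re-check the per-bucket status
radosgw-admin lc process
radosgw-admin lc list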
7. Verify the lifecycle took effect
7.1 Verify expiration deletion:
- First check whether the objects were deleted:
[root@ceph-3 s3cmd]# s3cmd --recursive ls s3://sensebucket/
[root@ceph-3 s3cmd] // empty
The files in sensebucket have indeed been deleted.
- Cross-check by inspecting the pools to confirm the bucket's data is gone
# First get the bucket id
[root@ceph-3 ~]# radosgw-admin bucket stats --bucket=sensebucket
{
"bucket": "sensebucket",
"num_shards": 0,
"tenant": "",
"zonegroup": "dc2a72aa-5db5-417c-9af0-1a8b6428d06a",
......
"id": "83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.1",
"marker": "83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.1",
"owner": "user123",
.......
}
# Look the bucket id up in the index pool
[root@ceph-3 ~]# rados -p class_hdd_pool_1.index ls - |grep 83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.1
.dir.83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.1
# Inspect the index object with listomapkeys
[root@ceph-3 ~]# rados -p class_hdd_pool_1.index listomapkeys .dir.83166252-1bf5-43e2-80f7-bd3e40c0edf3.96363.1
[root@ceph-3 ~]# // empty
# Check the data pool directly
[root@ceph-3 ~]# rados -p class_hdd_pool_1.data ls
[root@ceph-3 ~]# // empty
8. Lifecycle transition
An rgw lc transition only operates between storage classes under the same placement target. Section 4 already added the CLOD storage class to the zonegroup and zone, and the rgw config already contains the lifecycle settings, so no further setup is needed here.
8.1 boto3 lifecycle configuration
LifecycleConfiguration={
    'Rules': [
        {
            'Status': 'Enabled',
            'Prefix': 'transclod-',
            'Transition':
            {
                'Days': 1,
                'StorageClass': 'CLOD'
            },
            'ID': 'sensebucket_id_0987654321'  # any unique ID
        }
    ],
}
Run the script; the response is:
{u'Rules': [{u'Status': 'Enabled', u'Prefix': 'transclod-', u'Transition': {u'Days': 1, u'StorageClass': 'CLOD'}, u'ID': 'sensebucket_id_0987654321'}], 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': '', 'RequestId': 'tx000000000000000000007-005f8031b2-176db-zone1', 'HTTPHeaders': {'date': 'Fri, 19 Sep 2020 09:47:30 GMT', 'content-length': '305', 'x-amz-request-id': 'tx000000000000000000007-005f8031b2-176db-zone1', 'content-type': 'application/xml', 'connection': 'Keep-Alive'}}}
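If your s3cmd build includes the lifecycle subcommands, the applied rule can also be read back from the client side (a hedged check, not part of the original procedure):
s3cmd getlifecycle s3://sensebucket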
8.2 Test
Note that the rule's Prefix above is "transclod-".
First upload a file that does not carry the prefix:
[root@ceph-3 ~]# s3cmd put test.5M.gz s3://sensebucket/test.5M.gz.clod.1009
upload: 'test.5M.gz' -> 's3://sensebucket/test.5M.gz.clod.1009' [1 of 1]
5242880 of 5242880 100% in 0s 19.43 MB/s done
Check class_hdd_pool_1.data and class_hdd_pool_2.data: test.5M.gz.clod.1009 was not transitioned.
[root@ceph-3 storage-class]# rados -p class_hdd_pool_1.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009
[root@ceph-3 storage-class]#
[root@ceph-3 storage-class]# rados -p class_hdd_pool_2.data ls
[root@ceph-3 storage-class]#
Next, upload a file smaller than 4MB that carries the 'transclod-' prefix:
[root@ceph-3 s3cmd]# s3cmd put user.md.json s3://sensebucket/transclod-user.md.json.1009
upload: 'user.md.json' -> 's3://sensebucket/transclod-user.md.json.1009' [1 of 1]
1233 of 1233 100% in 0s 21.40 KB/s done
Check class_hdd_pool_1.data and class_hdd_pool_2.data again:
[root@ceph-3 storage-class]# rados -p class_hdd_pool_1.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-user.md.json.1009 # the transition was performed
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009
[root@ceph-3 storage-class]#
[root@ceph-3 storage-class]# rados -p class_hdd_pool_2.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0
[root@ceph-3 storage-class]#
The same test with a file larger than 4MB also results in a transition.
After the lifecycle ran, why does class_hdd_pool_1.data still contain these objects? Check their stat:
rados -p class_hdd_pool_1.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-user.md.json.1009
class_hdd_pool_1.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-user.md.json.1009 mtime 2020-10-09 05:50:02.000000, size 0
The transitioned object has size = 0: only the head object is left in the STANDARD pool. Now dump the full stat of transclod-user.md.json.1009 and analyze it:
radosgw-admin object stat --bucket=sensebucket --object=transclod-user.md.json.1009 > transclod-user.info
The contents of transclod-user.info are as follows:
{
"name": "transclod-user.md.json.1009",
"size": 1233,
"policy": {
"acl": {
.....
},
"owner": {
"id": "1009-user-01",
"display_name": "1009-user-01"
}
},
"etag": "d29d1ce22651886745d53560ffa097cb",
"tag": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95963.9",
"manifest": {
"objs": [],
"obj_size": 1233,
"explicit_objs": "false",
"head_size": 0,
"max_head_size": 0,
"prefix": ".vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_",
"rules": [
{
"key": 0,
"val": {
"start_part_num": 0,
"start_ofs": 0,
"part_size": 0,
"stripe_max_size": 4194304,
"override_prefix": ""
}
}
],
"tail_instance": "",
"tail_placement": {
"bucket": {
"name": "sensebucket",
"marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
"bucket_id": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
"tenant": "",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
}
},
"placement_rule": "zone1-placement/CLOD"
},
"begin_iter": { #begin_iter 和 end_iter 完全一样, 并且 {manifest.obj_size} == 1233 < 4MB, 故只有一个分片
"part_ofs": 0,
"stripe_ofs": 0,
"ofs": 0,
"stripe_size": 1233,
"cur_part_id": 0,
"cur_stripe": 0,
"cur_override_prefix": "",
"location": {
"placement_rule": "zone1-placement/CLOD", #当前分片存的存放规则
"obj": {
"bucket": {
"name": "sensebucket",
"marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
"bucket_id": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
"tenant": "",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
}
},
"key": {
"name": ".vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0", # ${marker}_${begin_iter.location.obj.key.name} 即为首分片名
"instance": "",
"ns": "shadow"
}
},
"raw_obj": {
"pool": "",
"oid": "",
"loc": ""
},
"is_raw": false
}
},
"end_iter": {
"part_ofs": 0,
"stripe_ofs": 0,
"ofs": 1233,
"stripe_size": 1233,
"cur_part_id": 0,
"cur_stripe": 0,
"cur_override_prefix": "",
"location": {
"placement_rule": "zone1-placement/CLOD",
"obj": {
"bucket": {
"name": "sensebucket",
"marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
"bucket_id": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
"tenant": "",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
}
},
"key": {
"name": ".vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0",
"instance": "",
"ns": "shadow"
}
},
"raw_obj": {
"pool": "",
"oid": "",
"loc": ""
},
"is_raw": false
}
}
},
"attrs": {
"user.rgw.content_type": "text/plain",
"user.rgw.pg_ver": "",
"user.rgw.source_zone": "`<80>n$",
"user.rgw.storage_class": "CLOD",
"user.rgw.tail_tag": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95963.9",
"user.rgw.x-amz-content-sha256": "eace10acc865aedde72fc44a1cdcef24bb6203a149a1d6db268651ddfb930e35",
"user.rgw.x-amz-date": "20201009T095002Z",
"user.rgw.x-amz-meta-s3cmd-attrs": "atime:1602230611/ctime:1600398602/gid:0/gname:root/md5:d29d1ce22651886745d53560ffa097cb/mode:33188/mtime:1600398602/uid:0/uname:root"
}
}
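Putting the manifest together: the bucket marker, the "shadow" namespace encoding, and the key name from begin_iter appear to combine into the tail object that actually holds the 1233 bytes, and that object lives in the CLOD pool. A quick check against the names seen earlier:
# the tail stripe referenced by the manifest sits in class_hdd_pool_2.data (CLOD)
rados -p class_hdd_pool_2.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0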
9. The reverse transition
The transition above moved objects from the default storage class STANDARD to CLOD: the object's head stays in the STANDARD pool (with data size = 0) while the data itself is moved to the CLOD pool.
This experiment uploads data directly into CLOD, then defines an lc policy that transitions it to STANDARD, and analyzes the result.
1>. First look at what is currently in the data pools:
[root@ceph-3 ~]# rados -p class_hdd_pool_1.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-user.md.json.1009
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-5mb.1009
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-test.1009
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009
[root@ceph-3 ~]# rados -p class_hdd_pool_2.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.6LI53GKdlVrAd_bv8wvPBmv9OyYzbZg_0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.cE4sjjs_HPFC4d19llRRYkz-mBhN2V1_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.cE4sjjs_HPFC4d19llRRYkz-mBhN2V1_0
2>. Upload a new file into CLOD with s3cmd:
[root@ceph-3 ~]# s3cmd put zone1.bak s3://bucket2/uptransfer-zone1.bak.1015 --storage-class=CLOD
upload: 'zone1.bak' -> 's3://bucket2/uptransfer-zone1.bak.1015' [1 of 1]
1846 of 1846 100% in 0s 55.20 KB/s done
3>. Check the data pools:
[root@ceph-3 ~]# rados -p class_hdd_pool_1.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-user.md.json.1009
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-5mb.1009
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_transclod-test.1009
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015 // the newly uploaded file, size = 0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009
[root@ceph-3 ~]#
rados -p class_hdd_pool_1.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015
class_hdd_pool_1.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015 mtime 2020-10-14 22:53:56.000000, size 0
[root@ceph-3 ~]# rados -p class_hdd_pool_2.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.6LI53GKdlVrAd_bv8wvPBmv9OyYzbZg_0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2__shadow_.xlPy_gdmzIsI2C9nwhZCM9d9suP4mnf_0 //uptransfer-zone1.bak.1015
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.cE4sjjs_HPFC4d19llRRYkz-mBhN2V1_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.cE4sjjs_HPFC4d19llRRYkz-mBhN2V1_0
4>. Define an lc policy for this bucket:
'Rules': [
    {
        'Status': 'Enabled',
        'Prefix': 'uptransfer-',
        'Transition':
        {
            'Days': 1,
            'StorageClass': 'STANDARD'
        },
        'ID': 'transbucket_id_0987654321_uptransfer_bucket2'
    }
]
5>. Run the lc policy script:
[root@ceph-3 uplayer_transition]# ./rgw_lifecycle_setup.py
{u'Rules': [{u'Status': 'Enabled', u'Prefix': 'uptransfer-', u'Transition': {u'Days': 1, u'StorageClass': 'STANDARD'}, u'ID': 'transbucket_id_0987654321_uptransfer_bucket2'}], 'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0, 'HostId': '', 'RequestId': 'tx000000000000000000020-005f87bafe-176db-zone1', 'HTTPHeaders': {'date': 'Thu, 15 Oct 2020 02:59:10 GMT', 'content-length': '329', 'x-amz-request-id': 'tx000000000000000000020-005f87bafe-176db-zone1', 'content-type': 'application/xml', 'connection': 'Keep-Alive'}}}
6>. Check whether the two data pools changed
[root@ceph-3 ~]# rados -p class_hdd_pool_1.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015
class_hdd_pool_1.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015 mtime 2020-10-14 22:53:56.000000, size 1846
// the data in pool_2 (CLOD) still exists at this point
[root@ceph-3 ~]# rados -p class_hdd_pool_2.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2__shadow_.xlPy_gdmzIsI2C9nwhZCM9d9suP4mnf_0
class_hdd_pool_2.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2__shadow_.xlPy_gdmzIsI2C9nwhZCM9d9suP4mnf_0 mtime 2020-10-14 22:53:56.000000, size 1846
As you can see, the data has now been moved to STANDARD.
7>. Will the data in CLOD be deleted?
Check the gc list:
// list gc entries:
[root@ceph-3 ~]# radosgw-admin gc list
[
{
"tag": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95963.30\u0000",
"time": "2020-10-15 00:59:32.0.926761s",
"objs": [
{
"pool": "class_hdd_pool_2.data",
"oid": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2__shadow_.xlPy_gdmzIsI2C9nwhZCM9d9suP4mnf_0", //待删除的gc_oid
"key": "",
"instance": ""
}
]
}
]
The object in pool_2 has been added to the gc list, meaning it will be deleted asynchronously (a note on forcing gc follows).
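If you do not want to wait out rgw_gc_obj_min_wait (see the parameters below), gc can also be inspected and driven by hand; a hedged aside, not part of the original procedure:
# show all gc entries, including ones whose min-wait has not expired yet
radosgw-admin gc list --include-all
# run a gc pass now
radosgw-admin gc process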
8>. Check the rgw_gc parameters:
// show the rgw_gc parameters
ceph --admin-daemon ceph-client.rgw.ceph-2.rgw0.28965.93983884214936.asok config show|grep gc
"rgw_gc_max_concurrent_io": "10",
"rgw_gc_max_objs": "32", //gc hint obj个数
"rgw_gc_max_trim_chunk": "16",
"rgw_gc_obj_min_wait": "7200", //对象被gc回收前最少等待时间:2小时
"rgw_gc_processor_max_time": "3600", //gc hint obj的超时时间
"rgw_gc_processor_period": "3600", //gc线程运行周期
"rgw_nfs_max_gc": "300",
"rgw_objexp_gc_interval": "600",
Wait two hours, then look at gc list and the contents of pool_2 again.
9>. After two hours, check gc list and pool_2.
[root@ceph-3 ~]# radosgw-admin gc list // now empty
[]
[root@ceph-3 ~]# rados -p class_hdd_pool_1.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015
class_hdd_pool_1.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.2_uptransfer-zone1.bak.1015 mtime 2020-10-14 22:53:56.000000, size 1846
[root@ceph-3 ~]# rados -p class_hdd_pool_2.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.vgXJ_TO-bsHFcqVWG8p6mvtc_ykVv-w_0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.6LI53GKdlVrAd_bv8wvPBmv9OyYzbZg_0
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.cE4sjjs_HPFC4d19llRRYkz-mBhN2V1_1
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.cE4sjjs_HPFC4d19llRRYkz-mBhN2V1_0
The gc list is now empty, and the shadow object __shadow_.xlPy_gdmzIsI2C9nwhZCM9d9suP4mnf_0 has been removed from pool_2. The data has thus been moved completely from CLOD to STANDARD.