1. RGW objects

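RGW provides two upload interfaces for objects: whole-object upload (a single PUT) and multipart upload. An object uploaded in a single PUT cannot exceed rgw_max_put_size (5GB by default).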

2. Whole-object upload

  • rgw_max_chunk_size

    The size of each I/O that RGW issues to RADOS; it is also the size of the RADOS head object.

  • rgw_obj_stripe_size

    The size of each intermediate RADOS object, referred to below as the stripe size.

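In a whole-object upload, the uploaded object maps to one RADOS object. That RADOS object is named after the original object, and the original object's metadata is stored in its extended attributes (xattrs). When the uploaded object is larger than the head object size, it is split into one head object, a run of intermediate objects that are each exactly one stripe in size, and one tail object that is no larger than a stripe.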

head object + [ intermediate object 1 + … + intermediate object n ] + tail object
    4MB                  4MB                         4MB                <=4MB
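The split described above can be sketched in a few lines of Python. This is only an illustration that assumes the 4MB values shown in the diagram; `split_whole_upload`, `chunk_size`, and `stripe_size` are names made up for this sketch, not RGW identifiers.

```python
MIB = 1024 * 1024


def split_whole_upload(obj_size, chunk_size=4 * MIB, stripe_size=4 * MIB):
    """Return the sizes of the RADOS objects behind one whole-object upload.

    The first entry is the head object; the remaining entries are the
    intermediate objects followed by the tail object.
    Hypothetical helper for illustration, not RGW code.
    """
    if obj_size < 0:
        raise ValueError("obj_size must be non-negative")
    if chunk_size <= 0 or stripe_size <= 0:
        raise ValueError("chunk_size and stripe_size must be positive")

    head = min(obj_size, chunk_size)      # head object holds at most one chunk
    sizes = [head]
    remaining = obj_size - head
    while remaining > 0:                  # the rest is cut into stripes
        piece = min(remaining, stripe_size)
        sizes.append(piece)
        remaining -= piece
    return sizes


if __name__ == "__main__":
    print(split_whole_upload(5 * MIB))    # 4MB head + 1MB tail
    print(split_whole_upload(13 * MIB))   # head + 2 intermediate + 1MB tail
```

With these assumed sizes, a 5MB object yields a 4MB head object and a 1MB tail object, with no intermediate objects in between.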

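The head object is named after the original object; RGW calls it the headobj. The data portion of the head object (headobj) holds up to rgw_max_chunk_size bytes of data, and its extended attributes hold the original object's metadata and the manifest information.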

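The intermediate and tail objects hold the rest of the original object's data. They are named as follows:

bucket_id + "_" + "_shadow_" + 32-character random string + "_" + stripe number

The random string begins with "." and the stripe number starts at 1. For example: 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1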
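As a small illustration of the naming scheme, the sketch below assembles head and shadow names in the form they take in `rados ls` output. The function names are hypothetical, and the `prefix` argument is assumed to be the random string including its leading "." and trailing "_".

```python
def head_object_name(bucket_id, object_name):
    """RADOS name of the head object: bucket id, "_", original object name."""
    return f"{bucket_id}_{object_name}"


def shadow_object_name(bucket_id, prefix, stripe_no):
    """RADOS name of an intermediate or tail object.

    prefix    -- "." + random string + "_" (assumed form for this sketch)
    stripe_no -- stripe number, counted from 1
    """
    if stripe_no < 1:
        raise ValueError("stripe numbers start at 1")
    return f"{bucket_id}__shadow_{prefix}{stripe_no}"


if __name__ == "__main__":
    bucket_id = "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1"
    print(head_object_name(bucket_id, "test.5M.gz.clod.1009"))
    print(shadow_object_name(bucket_id, ".sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_", 1))
```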

2.1 Storing a file larger than 4MB (whole-object upload)

1>. Upload a 5MB file, test.5M.gz.clod.1009, with s3cmd:

[root@muhongtao-ceph-3 ~]# s3cmd ls s3://sensebucket
2020-10-09 09:44      5242880  s3://sensebucket/test.5M.gz.clod.1009

2>. List the corresponding data pool. It contains two RADOS objects (here marker == bucket_id):

[root@ceph-3 ~]# rados -p class_hdd_pool_1.data ls
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1    # ${marker}__shadow_${prefix}${stripe_number}
8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009    # ${marker}_${object_name}

3>. Check the stat of each of the two objects, using:

rados -p ${pool_name} stat ${marker}_${object_name}
[root@ceph-4 ~]# rados -p class_hdd_pool_1.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009
class_hdd_pool_1.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1_test.5M.gz.clod.1009 mtime 2020-10-09 05:44:15.000000, size 4194304

[root@ceph-4 mht_workspace]# rados -p class_hdd_pool_1.data stat 8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1
class_hdd_pool_1.data/8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1__shadow_.sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1 mtime 2020-10-09 05:44:15.000000, size 1048576

The two objects are 4MB and 1MB in size, respectively.

4>. Look at the stat of the whole test.5M.gz.clod.1009 object in the bucket:

radosgw-admin object stat --bucket=sensebucket --object=test.5M.gz.clod.1009 > 5m.info
{
    "name": "test.5M.gz.clod.1009",
    "size": 5242880,
    "policy": {
        "acl": {
            .....
        },
        "owner": {
            "id": "1009-user-01",
            "display_name": "1009-user-01"
        }
    },
    "etag": "5f363e0e58a95f06cbe9bbc662c5dfb6",
    "tag": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95963.5",
    "manifest": {
        "objs": [],
        "obj_size": 5242880,
        "explicit_objs": "false",
        "head_size": 4194304,
        "max_head_size": 4194304,
        "prefix": ".sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_",
        "rules": [
            {
                "key": 0,
                "val": {
                    "start_part_num": 0,
                    "start_ofs": 4194304,
                    "part_size": 0,
                    "stripe_max_size": 4194304,
                    "override_prefix": ""
                }
            }
        ],
        "tail_instance": "",
        "tail_placement": {
            "bucket": {
                "name": "sensebucket",
                "marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
                "bucket_id": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
                "tenant": "",
                "explicit_placement": {
                    "data_pool": "",
                    "data_extra_pool": "",
                    "index_pool": ""
                }
            },
            "placement_rule": "zone1-placement"
        },
        "begin_iter": {
            "part_ofs": 0,
            "stripe_ofs": 0,
            "ofs": 0,
            "stripe_size": 4194304,
            "cur_part_id": 0,
            "cur_stripe": 0,
            "cur_override_prefix": "",
            "location": {
                "placement_rule": "zone1-placement",
                "obj": {
                    "bucket": {
                        "name": "sensebucket",
                        "marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
                        "bucket_id": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
                        "tenant": "",
                        "explicit_placement": {
                            "data_pool": "",
                            "data_extra_pool": "",
                            "index_pool": ""
                        }
                    },
                    "key": {
                        "name": "test.5M.gz.clod.1009",			# ${marker}_${begin_iter.location.obj.key.name} is the name of the head RADOS object
                        "instance": "",
                        "ns": ""
                    }
                },
                "raw_obj": {
                    "pool": "",
                    "oid": "",
                    "loc": ""
                },
                "is_raw": false
            }
        },
        "end_iter": {
            "part_ofs": 4194304,
            "stripe_ofs": 4194304,
            "ofs": 5242880,
            "stripe_size": 1048576,
            "cur_part_id": 0,
            "cur_stripe": 1,
            "cur_override_prefix": "",
            "location": {
                "placement_rule": "zone1-placement",
                "obj": {
                    "bucket": {
                        "name": "sensebucket",
                        "marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
                        "bucket_id": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1",
                        "tenant": "",
                        "explicit_placement": {
                            "data_pool": "",
                            "data_extra_pool": "",
                            "index_pool": ""
                        }
                    },
                    "key": {
                        "name": ".sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_1",		# ${marker}__${ns}_${end_iter.location.obj.key.name} is the name of the tail RADOS object
                        "instance": "",
                        "ns": "shadow"
                    }
                },
                "raw_obj": {
                    "pool": "",
                    "oid": "",
                    "loc": ""
                },
                "is_raw": false
            }
        }
    },
    "attrs": {
        "user.rgw.content_type": "application/octet-stream",
        "user.rgw.pg_ver": "",
        "user.rgw.source_zone": "`<80>n$",
        "user.rgw.storage_class": "STANDARD",
        "user.rgw.tail_tag": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95963.5",
        "user.rgw.x-amz-content-sha256": "c036cbb7553a909f8b8877d4461924307f27ecb66cff928eeeafd569c3887e29",
        "user.rgw.x-amz-date": "20201009T094415Z",
        "user.rgw.x-amz-meta-s3cmd-attrs": "atime:1602215145/ctime:1600336720/gid:0/gname:root/md5:5f363e0e58a95f06cbe9bbc662c5dfb6/mode:33188/mtime:1600336720/uid:0/uname:root"
    }
}            
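The manifest fields in this dump are enough to reconstruct the RADOS object names and sizes seen in steps 2 and 3. The following is a rough sketch, not RGW's own logic: it assumes a non-multipart object with a single rule, reads only the fields named in the code, and `rados_objects_from_stat` is a hypothetical helper. The sample is a trimmed copy of the dump; the full 5m.info file as printed here would first need its "....." placeholder and "#" annotations removed to parse as JSON.

```python
import json


def rados_objects_from_stat(stat):
    """List (rados_name, size) pairs for a whole-object upload.

    Assumes a single manifest rule, as in the dump above (sketch only).
    """
    manifest = stat["manifest"]
    marker = manifest["tail_placement"]["bucket"]["marker"]
    prefix = manifest["prefix"]
    obj_size = manifest["obj_size"]
    head_size = manifest["head_size"]
    stripe_size = manifest["rules"][0]["val"]["stripe_max_size"]
    if stripe_size <= 0:
        raise ValueError("stripe_max_size must be positive")

    # Head object: ${marker}_${object_name}
    objects = [(f"{marker}_{stat['name']}", head_size)]

    # Remaining data: ${marker}__shadow_${prefix}${stripe_number}
    offset, stripe_no = head_size, 1
    while offset < obj_size:
        size = min(stripe_size, obj_size - offset)
        objects.append((f"{marker}__shadow_{prefix}{stripe_no}", size))
        offset += size
        stripe_no += 1
    return objects


# Trimmed copy of the dump above; only the fields the sketch reads are kept.
SAMPLE_STAT = json.loads("""
{
    "name": "test.5M.gz.clod.1009",
    "manifest": {
        "obj_size": 5242880,
        "head_size": 4194304,
        "prefix": ".sPQ50HnGJ3iXld7Dpvor7lDKsdlpbjH_",
        "rules": [{"key": 0, "val": {"stripe_max_size": 4194304}}],
        "tail_placement": {
            "bucket": {"marker": "8fb2def7-7ccd-4803-a76a-566554e21b9e.95969.1"}
        }
    }
}
""")

if __name__ == "__main__":
    for name, size in rados_objects_from_stat(SAMPLE_STAT):
        print(size, name)
```

On the sample this produces the same two names, with sizes 4194304 and 1048576, that `rados ls` and `rados stat` reported in steps 2 and 3.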

3. Multipart upload

  • rgw_multipart_min_part_size

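    For multipart upload the client chooses the part size, but each part must be at least this large: 5MB by default (the last part may be smaller).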

  • rgw_max_put_size

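    The largest object accepted in a single PUT, 5GB by default. A file larger than this value has to be uploaded with multipart upload.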
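The two limits can be put together in a small client-side check. This is a minimal sketch that assumes the default values quoted above (5MB and 5GB, taken as binary units); `needs_multipart` and `part_sizes` are hypothetical helpers, not part of any S3 client.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def needs_multipart(file_size, max_put_size=5 * GIB):
    """True when the file is too large for a single PUT."""
    return file_size > max_put_size


def part_sizes(file_size, part_size, min_part_size=5 * MIB):
    """Split file_size into parts of part_size; the last part may be smaller."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < min_part_size:
        raise ValueError("part_size is below the minimum part size")
    full_parts, rest = divmod(file_size, part_size)
    return [part_size] * full_parts + ([rest] if rest else [])


if __name__ == "__main__":
    print(needs_multipart(6 * GIB))
    print(part_sizes(12 * MIB, 5 * MIB))
```

A 12MB file uploaded with 5MB parts, for example, is sent as two 5MB parts and one 2MB part.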

3.1 Multipart upload data layout

  • When an object is uploaded in parts, RGW splits each part into multiple RADOS objects according to the stripe size.

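  • Naming format of the first RADOS object of each part:

    bucketid + "_" + uploaded object name + "." + uploadid + "." + part number

  • Naming format of the remaining RADOS objects of each part:

    bucketid + "_" + uploaded object name + "." + uploadid + "." + part number + "_" + stripe number

    In a listing of the data pool these names also carry a namespace marker right after the bucketid separator, as with the __shadow_ names in section 2: _multipart_ for the first object of each part and _shadow_ for the rest.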
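To make the two formats concrete, the sketch below lists the RADOS object names for one part. The names include the namespace markers `_multipart_` and `_shadow_` that RADOS listings show after the bucket id; treat that detail, the 4MB stripe size, and the sample bucket id, object name, and upload id as assumptions of this sketch. `multipart_part_objects` is a hypothetical helper.

```python
MIB = 1024 * 1024


def multipart_part_objects(bucket_id, object_name, upload_id, part_no,
                           part_size, stripe_size=4 * MIB):
    """Return the RADOS object names that one uploaded part is split into."""
    if part_size <= 0 or stripe_size <= 0:
        raise ValueError("part_size and stripe_size must be positive")

    base = f"{object_name}.{upload_id}.{part_no}"
    # First RADOS object of the part.
    names = [f"{bucket_id}__multipart_{base}"]

    # Remaining RADOS objects, one per further stripe, numbered from 1.
    stripe_count = -(-part_size // stripe_size)   # ceiling division
    for stripe_no in range(1, stripe_count):
        names.append(f"{bucket_id}__shadow_{base}_{stripe_no}")
    return names


if __name__ == "__main__":
    # Hypothetical bucket id, object name, and upload id.
    for name in multipart_part_objects("b.1", "big.bin", "2~xyz", 1, 5 * MIB):
        print(name)
```

Under these assumptions a 5MB part maps to two RADOS objects: the first object of the part plus one further stripe numbered 1.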