Commit d7a1fee1 authored by Dan Ehrenberg's avatar Dan Ehrenberg Committed by Theodore Ts'o
Browse files

ext4: make the preallocation size be a multiple of stripe size



Previously, if a stripe width was provided, then it would be used
as the preallocation granularity, with no santiy checking and no
way to override this. Now, mb_prealloc_size defaults to the smallest
multiple of stripe size that is greater than or equal to the old
default mb_prealloc_size, and this can be overridden with the sysfs
interface.

Signed-off-by: default avatarDan Ehrenberg <dehrenberg@google.com>
Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
parent 265c6a0f
Loading
Loading
Loading
Loading
+20 −9
Original line number Diff line number Diff line
@@ -128,12 +128,13 @@
 * we are doing a group prealloc we try to normalize the request to
 * sbi->s_mb_group_prealloc. Default value of s_mb_group_prealloc is
 * 512 blocks. This can be tuned via
 * /sys/fs/ext4/<partition/mb_group_prealloc. The value is represented in
 * /sys/fs/ext4/<partition>/mb_group_prealloc. The value is represented in
 * terms of number of blocks. If we have mounted the file system with -O
 * stripe=<value> option the group prealloc request is normalized to the
 * stripe value (sbi->s_stripe)
 * the smallest multiple of the stripe value (sbi->s_stripe) which is
 * greater than the default mb_group_prealloc.
 *
 * The regular allocator(using the buddy cache) supports few tunables.
 * The regular allocator (using the buddy cache) supports a few tunables.
 *
 * /sys/fs/ext4/<partition>/mb_min_to_scan
 * /sys/fs/ext4/<partition>/mb_max_to_scan
@@ -2474,6 +2475,18 @@ int ext4_mb_init(struct super_block *sb, int needs_recovery)
	sbi->s_mb_stream_request = MB_DEFAULT_STREAM_THRESHOLD;
	sbi->s_mb_order2_reqs = MB_DEFAULT_ORDER2_REQS;
	sbi->s_mb_group_prealloc = MB_DEFAULT_GROUP_PREALLOC;
	/*
	 * If there is a s_stripe > 1, then we set the s_mb_group_prealloc
	 * to the lowest multiple of s_stripe which is bigger than
	 * the s_mb_group_prealloc as determined above. We want
	 * the preallocation size to be an exact multiple of the
	 * RAID stripe size so that preallocations don't fragment
	 * the stripes.
	 */
	if (sbi->s_stripe > 1) {
		sbi->s_mb_group_prealloc = roundup(
			sbi->s_mb_group_prealloc, sbi->s_stripe);
	}

	sbi->s_locality_groups = alloc_percpu(struct ext4_locality_group);
	if (sbi->s_locality_groups == NULL) {
@@ -2841,8 +2854,9 @@ out_err:

/*
 * here we normalize request for locality group
 * Group request are normalized to s_strip size if we set the same via mount
 * option. If not we set it to s_mb_group_prealloc which can be configured via
 * Group request are normalized to s_mb_group_prealloc, which goes to
 * s_strip if we set the same via mount option.
 * s_mb_group_prealloc can be configured via
 * /sys/fs/ext4/<partition>/mb_group_prealloc
 *
 * XXX: should we try to preallocate more than the group has now?
@@ -2853,9 +2867,6 @@ static void ext4_mb_normalize_group_request(struct ext4_allocation_context *ac)
	struct ext4_locality_group *lg = ac->ac_lg;

	BUG_ON(lg == NULL);
	if (EXT4_SB(sb)->s_stripe)
		ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_stripe;
	else
	ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_mb_group_prealloc;
	mb_debug(1, "#%u: goal %u blocks for locality group\n",
		current->pid, ac->ac_g_ex.fe_len);