2016-08-02 | Adam Boliński

Advanced Format 4k – Redo Log Performance


This is my first post regarding 4k native block size. I will try to investigate the redo log performance improvement of a 4k native storage block size versus a 512-byte one. We will compare the same workload on the same DB appliance (this is not an ODA) using two Intel P3700 drives in a uATX form factor (max 4 NVMe disks). The workload will be generated using Kevin Closson's SLOB (many thanks to Kevin for this tool) and HammerDB.

A similar (but not identical) test was done by my friend Franck Pachot, which you can find under this link: http://blog.dbi-services.com/redo-log-block-size-on-oda

In each scenario the redo logs were located on a +REDO disk group (External Redundancy) on one of the two P3700 NVMe disks; the database data files were located on the second NVMe disk, in an External Redundancy disk group called +DATA.

Before beginning these tests, you must know that the only supported way to use native 4k block size storage is through ASMLib. It is not possible to write to a 4k disk group while bypassing ASMLib, so please do not try to write to /dev/oracleasm/disks directly; this will fail. Note that you can create a 4k disk group omitting ASMLib, but you will fail to write to a 4k disk group created that way.

Of course, we could create redo log files on a native 4k block disk using the emulated mode (512e), which presents a 512-byte logical block size to the database; combined with setting "_disk_sector_size_override" to TRUE, this still allows 4k redo blocks. In our case, however, I do not want to test in emulated mode but in real native mode.
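For reference, this is roughly what the emulated-mode setup would look like. This is a sketch only (the log file group number and size are made up for illustration); the BLOCKSIZE clause on ADD LOGFILE is documented and available since 11.2:

-- allow 4k redo blocks even though the storage reports 512-byte sectors (512e)
ALTER SYSTEM SET "_disk_sector_size_override" = TRUE SCOPE=SPFILE;

-- request 4k redo blocks explicitly when adding a log file group
-- (group number and size are illustrative)
ALTER DATABASE ADD LOGFILE GROUP 4 ('+REDO') SIZE 2G BLOCKSIZE 4096;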

Back to our test: the data was populated by SLOB/HammerDB, so we only need to fire our first test on the native 512-byte block sector.

Let's start by preparing the storage with a 512-byte block size:

isdct start -intelssd 0 -NVMeFormat LBAFormat=0 SecureEraseSetting=2 ProtectionInformation=0 MetaDataSetting=0
isdct start -intelssd 1 -NVMeFormat LBAFormat=0 SecureEraseSetting=2 ProtectionInformation=0 MetaDataSetting=0

So now we have a native 512-byte block size on our storage devices…

Let's create the two disk groups, +DATA and +REDO, in External Redundancy using ASMLib, to ensure that the two test scenarios are identical; a sketch of the commands follows, and the detected sector sizes are listed below.
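A minimal sketch of the disk group creation, assuming the NVMe partitions were stamped REDO1 and DATA1 with ASMLib (the labels are hypothetical):

CREATE DISKGROUP REDO EXTERNAL REDUNDANCY DISK 'ORCL:REDO1';
CREATE DISKGROUP DATA EXTERNAL REDUNDANCY DISK 'ORCL:DATA1';

The listing below shows the sector size ASM detected for each disk group; v$asm_diskgroup.sector_size is the relevant column, aliased here to match the heading:

SELECT name, sector_size AS block_size FROM v$asm_diskgroup;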

NAME    BLOCK_SIZE
----------------------------------------
REDO     512
DATA     512

So let's start the SLOB workload:

NOTIFY : 2016.08.01-11:06:15 : Testing SYSDBA connectivity to the instance to validate slob.conf settings.
NOTIFY : 2016.08.01-11:06:15 : Testing connectivity. Command: "sqlplus -L sys/sys@slob as sysdba"
NOTIFY : 2016.08.01-11:06:15 : Connections will rotor from @slob1 through @slob2
NOTIFY : 2016.08.01-11:06:15 : Testing connectivity. Command: "sqlplus -L user1/user1@slob1"
NOTIFY : 2016.08.01-11:06:15 : Testing connectivity. Command: "sqlplus -L user32/user32@slob1"
NOTIFY : 2016.08.01-11:06:15 : Testing connectivity. Command: "sqlplus -L user1/user1@slob2"
NOTIFY : 2016.08.01-11:06:15 : Testing connectivity. Command: "sqlplus -L user32/user32@slob2"
NOTIFY : 2016.08.01-11:06:16 : Performing redo log switch.
NOTIFY : 2016.08.01-11:06:17 : Redo log switch complete.
NOTIFY : 2016.08.01-11:06:17 : Setting up trigger mechanism.
NOTIFY : 2016.08.01-11:06:27 : Running iostat, vmstat and mpstat on current host--in background.
NOTIFY : 2016.08.01-11:06:27 : Connecting 1 sessions to 32 schema(s) ...
NOTIFY : 2016.08.01-11:06:28 :
NOTIFY : 2016.08.01-11:06:28 : Pausing for 5 seconds before triggering the test.
NOTIFY : 2016.08.01-11:06:33 : Executing AWR "before snap" procedure. Connect string is "sqlplus -S -L sys/sys@slob as sysdba"
NOTIFY : 2016.08.01-11:06:34 :
NOTIFY : 2016.08.01-11:06:34 : Triggering the test.
NOTIFY : 2016.08.01-11:06:34 : List of monitored sqlplus PIDs written to /tmp/.7174_slob_pids.out
NOTIFY : 2016.08.01-11:06:39 : Waiting for 292 seconds before monitoring running processes (for exit).
NOTIFY : 2016.08.01-11:11:31 : Entering process monitoring loop.
NOTIFY : 2016.08.01-11:11:35 : Run time in seconds was:  301
NOTIFY : 2016.08.01-11:11:35 : Executing AWR "after snap" procedure. Connect string is "sqlplus -S -L sys/sys@slob as sysdba"
NOTIFY : 2016.08.01-11:11:36 : Generating AWR reports. HTML reports will be compressed. Connect string is "sqlplus -L sys/sys@slob as sysdba"
NOTIFY : 2016.08.01-11:11:37 : Terminating background data collectors.
./runit.sh: line 1057:  3789 Killed                  ( iostat -xm 3 > iostat.out 2>&1 )
./runit.sh: line 1057:  3790 Killed                  ( vmstat 3 > vmstat.out 2>&1 )
./runit.sh: line 1057:  3791 Killed                  ( mpstat -P ALL 3 > mpstat.out 2>&1 )
NOTIFY : 2016.08.01-11:11:45 :
NOTIFY : 2016.08.01-11:11:45 : SLOB test is complete.

Before I present the results of my test, you must note that it is not meant to show the maximum performance of the redo logs or the NVMe drives; in other words, we are comparing results in a realistically simulated production environment.

Note that in one of the tests I achieved the following maximum performance for the redo logs:

                        Per Second        Per Transaction   Per Exec
Redo size (bytes):      507,914,043.7             5,750.6

So let's investigate the test results for the 512-byte native block size.

Load Profile – 512-byte Native Block Size

                            Per Second   Per Transaction   Per Exec   Per Call
DB Time(s):                       31.9               0.2       0.15       4.07
DB CPU(s):                         1.4               0.0       0.01       0.18
Background CPU(s):                 0.4               0.0       0.00       0.00
Redo size (bytes):       217,863,651.9       1,484,687.2
Logical read (blocks):        87,665.4             597.4
Block changes:               114,220.1             778.4
Wait Event – log file parallel write, 512 byte

Event                      Waits   %Time-outs   Total Wait Time (s)   Avg wait (ms)   Waits/txn   % bg time
log file parallel write   37,478            0                   134            3.57        0.85       18.62
Wait Event Histogram – log file parallel write, 512 byte (% of waits)

Event                     Total Waits   <1ms   <2ms   <4ms   <8ms   <16ms   <32ms   <=1s   >1s
log file parallel write         37.5K   19.6   24.4   33.4   19.2     2.4      .2     .7
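The same latency distribution can also be read live, outside of AWR, from v$event_histogram, which keeps cumulative wait counts per power-of-two latency bucket:

SELECT event, wait_time_milli, wait_count
  FROM v$event_histogram
 WHERE event = 'log file parallel write'
 ORDER BY wait_time_milli;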

To generate some contention and measure log file sync waits, I simply used HammerDB; the resulting figures are below, with a query for watching these waits live shown after the table.

Wait Event – log file sync, 512 byte

Event           Waits        Total Wait Time (sec)   Wait Avg (ms)   % DB time   Wait Class
log file sync   17,018,243                   41.8K            2.45        38.1   Commit
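If you want to watch these waits while HammerDB is running, v$system_event gives the cumulative figures; this is a sketch (deltas between two samples give the per-interval averages that AWR reports):

SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1e6) AS total_wait_s,
       ROUND(time_waited_micro / 1000 / NULLIF(total_waits, 0), 2) AS avg_wait_ms
  FROM v$system_event
 WHERE event IN ('log file sync', 'log file parallel write');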

OK, so now let's compare 4k redo log performance in the same environment and test scenario.

Let's start by preparing the storage with a 4k block size for the redo logs. Only the REDO disk is reformatted (LBAFormat=3 selects the 4096-byte sector format on these drives); the DATA disk stays at 512 bytes:

isdct start -intelssd 0 -NVMeFormat LBAFormat=0 SecureEraseSetting=2 ProtectionInformation=0 MetaDataSetting=0
isdct start -intelssd 1 -NVMeFormat LBAFormat=3 SecureEraseSetting=2 ProtectionInformation=0 MetaDataSetting=0

NAME    BLOCK_SIZE
----------------------------------------
REDO     4096
DATA     512
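It is also worth confirming that the redo log files themselves now use 4k blocks; v$log exposes the block size per group (the BLOCKSIZE column is available in 11.2 and later releases):

SELECT group#, thread#, blocksize, bytes/1024/1024 AS size_mb, status
  FROM v$log;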
Load Profile – 4k Native Block Size

                            Per Second   Per Transaction   Per Exec   Per Call
DB Time(s):                       31.8               0.2       0.05      21.35
DB CPU(s):                         3.0               0.0       0.00       2.03
Background CPU(s):                 0.5               0.0       0.00       0.00
Redo size (bytes):       258,528,798.2       1,737,881.6
Logical read (blocks):       174,866.2           1,175.5
Wait Event – log file parallel write, 4k

Event                      Waits   %Time-outs   Total Wait Time (s)   Avg wait (ms)   Waits/txn   % bg time
log file parallel write   70,249            0                    62            0.89        1.56       14.23
Wait Event Histogram – log file parallel write, 4k (% of waits)

Event                     Total Waits   <1ms   <2ms   <4ms   <8ms   <16ms   <32ms   <=1s   >1s
log file parallel write         70.2K   64.3   33.4    1.9     .3      .1      .1
Wait Event – log file sync, 4k

Event           Waits        Total Wait Time (sec)   Wait Avg (ms)   % DB time   Wait Class
log file sync   20,127,143                   22.9K            1.13        39.1   Commit

 

Results and Conclusion

As you can see, using 4k gives us much lower latency and also more bandwidth, though when the storage is NVMe disks this is not a big deal. In my opinion, you can think about changing the default 512-byte redo log disk group to 4k when the redo log is the main bottleneck in your production databases; frankly, I have seen plenty of "special" production configurations/applications in my life where this can be very useful.

Next To Do

This is one of the first posts regarding 4k Advanced Format, and in a few days I will present many more results regarding overall DB performance with 512 byte vs. 4k, number of transactions, etc.