I think binlog_group_commit_sync_delay=1000 is not what we want, and it's why sync_binlog=0 has such a big impact on replica throughput.
From the MySQL docs:
Controls how many microseconds the binary log commit waits before synchronizing the binary log file to disk. By default binlog_group_commit_sync_delay is set to 0, meaning that there is no delay. Setting binlog_group_commit_sync_delay to a microsecond delay enables more transactions to be synchronized together to disk at once, reducing the overall time to commit a group of transactions because the larger groups require fewer time units per group.
Setting binlog_group_commit_sync_delay can increase the number of parallel committing transactions on any server that has (or might have after a failover) a replica, and therefore can increase parallel execution on the replicas.
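To get a feel for what the delay can cost, here is a rough back-of-the-envelope sketch. It is a simplified model, not how MySQL actually schedules commits: it assumes every binlog group waits the full delay before its sync, that groups are synced one after another, and that binlog_group_commit_sync_no_delay_count does not cut the wait short. The function names and the txns_per_group / sync_cost_us parameters are made up for illustration.

```python
def max_group_commits_per_second(sync_delay_us: int, sync_cost_us: int = 0) -> float:
    """Upper bound on binlog group syncs per second in this simplified model.

    sync_delay_us: value of binlog_group_commit_sync_delay (microseconds).
    sync_cost_us:  assumed cost of the sync itself (hypothetical parameter).
    """
    per_group_us = sync_delay_us + sync_cost_us
    if per_group_us <= 0:
        # No delay and no sync cost modelled: this model imposes no ceiling.
        return float("inf")
    return 1_000_000 / per_group_us


def max_commits_per_second(sync_delay_us: int, txns_per_group: int,
                           sync_cost_us: int = 0) -> float:
    """Ceiling on committed transactions per second, given an assumed group size."""
    return max_group_commits_per_second(sync_delay_us, sync_cost_us) * txns_per_group


# With a 1000 microsecond delay and one transaction per group,
# the model caps out at 1000 commits per second.
print(max_commits_per_second(1000, 1))   # 1000.0
# Larger groups raise the ceiling proportionally.
print(max_commits_per_second(1000, 4))   # 4000.0
```

If the replica only manages small commit groups, a ceiling like this could plausibly explain a throughput drop, but the group size would need to be measured rather than assumed.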
I set binlog_group_commit_sync_delay to 1000 (1 ms) a few weeks ago in an attempt to increase parallel execution on the replicas. It didn't have a negative impact on query duration