Discussion:
RegionServer Group based WAL creation & replication
Wellington Chevreuil
2018-12-10 11:28:20 UTC
Permalink
Hi Nikhil, yeah, a JIRA would be more suitable for discussions involving
code proposals, with patch reviews.

Thinking about the tradeoff of benefits versus impacts/risks of these
changes, one thing that comes to my mind is that, most of the time,
with a normally functioning HDFS as the file system, WAL file blocks
would already be located on nodes of the given RS group due to data
locality. So do you feel the proposed refactoring is still relevant?

As a side note, you might want to focus on the branch-2 code base for new
features such as this, since there's been discussion about targeting
only bug fixes for branch-1, as version 1 approaches EOL.

On Mon, Dec 10, 2018, Nikhil Bafna wrote:
I'm looking at extending HBASE-6721 to apply it to WALs, so that WALs are
created & replicated within an RSGroup. This extends multi-tenancy to WALs
as well, not just HBase data. I was working out of the 1.2.x code.
The approach I'm using is (a rough sketch follows after the list):
- A strategy interface for WAL placement on the filesystem. The default
delegates placement to the respective filesystem (the old behavior); the
FavoredNode strategy computes the favored nodes from the RSGroup
memberships.
- The FavoredNode strategy requires an instance of hbase.Server, to get the
current server name, and a ZooKeeper watch to listen for changes to
RSGroup memberships.
- The strategy is initialised in HRegionServer init and set in a static
field in DefaultWALProvider.
- DefaultWALProvider.Writer takes the strategy in its init, and invokes it
before output stream creation, passing the favored nodes information
to DistributedFileSystem.create().
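
To make the strategy part concrete, here is a rough, self-contained sketch of
what I mean. The names (WALPlacementStrategy, DefaultPlacementStrategy,
RSGroupFavoredNodeStrategy) are placeholders from my patch, not existing HBase
classes, and the plain membership map stands in for what would really be kept
up to date from the RSGroup data in ZooKeeper:

import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.hbase.ServerName;

/** Placeholder strategy interface for choosing WAL block placement. */
interface WALPlacementStrategy {
  /** Favored nodes to hint to the filesystem, or null to let HDFS decide (old behavior). */
  InetSocketAddress[] favoredNodes();
}

/** Default: no hint; placement is delegated entirely to the filesystem. */
class DefaultPlacementStrategy implements WALPlacementStrategy {
  @Override
  public InetSocketAddress[] favoredNodes() {
    return null;
  }
}

/** Computes favored nodes from the RSGroup membership of the local region server. */
class RSGroupFavoredNodeStrategy implements WALPlacementStrategy {
  private final ServerName localServer;
  private final String localGroup;
  // group name -> members; in the real patch this would be refreshed by a
  // ZooKeeper watch on the rsgroup data instead of being a fixed map.
  private final Map<String, List<ServerName>> groupMembers;

  RSGroupFavoredNodeStrategy(ServerName localServer, String localGroup,
      Map<String, List<ServerName>> groupMembers) {
    this.localServer = localServer;
    this.localGroup = localGroup;
    this.groupMembers = groupMembers;
  }

  @Override
  public InetSocketAddress[] favoredNodes() {
    List<ServerName> members = groupMembers.get(localGroup);
    if (members == null) {
      members = Collections.emptyList();
    }
    List<InetSocketAddress> favored = new ArrayList<InetSocketAddress>();
    // Local server first, so the first WAL replica stays on this region server.
    favored.add(new InetSocketAddress(localServer.getHostname(), localServer.getPort()));
    for (ServerName sn : members) {
      if (!sn.equals(localServer)) {
        favored.add(new InetSocketAddress(sn.getHostname(), sn.getPort()));
      }
    }
    return favored.toArray(new InetSocketAddress[favored.size()]);
  }
}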
A few questions:
- Any glaring misses in the approach?
- I'm hesitant about setting the strategy in a static field in
DefaultWALProvider (sketched below); I would have preferred to pass it in
via "init" itself, but that change seems too expansive.
- Also, this introduces a dependency on a server/ZooKeeper instance inside
the WAL code path, which does not seem to exist so far. Is that an
explicit choice to keep them separate?
- If this seems like a useful change, should I open a JIRA, add patches,
and seek feedback there?
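
To make the second question concrete, this is roughly the wiring I have.
DefaultWALProvider and HRegionServer are the real classes, but the static
field, setter, and the startup call shown below are additions from my patch
(heavily abridged):

// In DefaultWALProvider (abridged): a static holder so Writer instances can
// reach the strategy without changing the provider/writer init signatures.
public class DefaultWALProvider {
  private static volatile WALPlacementStrategy placementStrategy =
      new DefaultPlacementStrategy();

  public static void setPlacementStrategy(WALPlacementStrategy strategy) {
    placementStrategy = strategy;
  }

  static WALPlacementStrategy getPlacementStrategy() {
    return placementStrategy;
  }
}

// In HRegionServer initialisation (abridged): build the strategy from the
// local server name and the RSGroup membership, then publish it:
//
//   DefaultWALProvider.setPlacementStrategy(
//       new RSGroupFavoredNodeStrategy(getServerName(), localGroup, groupMembers));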
--
Nikhil Bafna | 8095234263
Nikhil Bafna
2018-12-10 11:55:40 UTC
Permalink
Post by Wellington Chevreuil
one thing that comes to my mind is that, most of the time,
with a normally functioning HDFS as the file system, WAL file blocks
would already be located on nodes of the given RS group due to data
locality
The primary node hosting the WAL blocks would be the same as the region
server. But would the secondary and tertiary replicas of the WALs also be
within the same RS group by default? From the code, I don't see any hints
passed to HDFS during WAL output stream creation to indicate this.
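
For reference, this is the kind of call I'd like the WAL writer to make when
the filesystem is HDFS. The wrapper method is mine; the
DistributedFileSystem#create overload that takes a favored-nodes hint is, as
far as I can tell, available in Hadoop 2.x:

import java.io.IOException;
import java.net.InetSocketAddress;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class WalCreateSketch {
  /**
   * Create the WAL output stream, passing favored nodes when the filesystem
   * is HDFS; otherwise fall back to the plain create() (current behavior).
   */
  static FSDataOutputStream createWal(FileSystem fs, Path walPath,
      InetSocketAddress[] favoredNodes, int bufferSize, short replication,
      long blockSize) throws IOException {
    if (favoredNodes != null && favoredNodes.length > 0
        && fs instanceof DistributedFileSystem) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // The NameNode will try to place block replicas on the favored nodes.
      return dfs.create(walPath, FsPermission.getFileDefault(),
          true /* overwrite */, bufferSize, replication, blockSize,
          null /* progress */, favoredNodes);
    }
    // Old behavior: no placement hint, HDFS chooses the replica locations.
    return fs.create(walPath, true, bufferSize, replication, blockSize);
  }
}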

--
Nikhil Bafna | 8095234263


Nikhil Bafna
2018-12-10 12:02:19 UTC
Permalink
I got the HDFS block report for my cluster. We run with 7-8 RS groups. The
data (primary + replicas) is placed within the RS group, since we use the
RS Group Load Balancer. For the WAL files, only the primary replica is on
the same server as the HBase region server; the secondary and tertiary
replicas are spread across the cluster, irrespective of RS groups.

In this case, multi-tenancy is still not complete without WAL isolation
per rsgroup/tenant.
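
For anyone who wants to reproduce the check, here is a small standalone
sketch using the FileSystem block-location API; the WAL directory path
below is just illustrative of the usual /hbase/WALs/<servername> layout:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WalBlockLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Illustrative path: the WAL directory of one region server.
    Path walDir = new Path("/hbase/WALs/rs-host,16020,1544437000000");
    for (FileStatus wal : fs.listStatus(walDir)) {
      for (BlockLocation loc : fs.getFileBlockLocations(wal, 0, wal.getLen())) {
        // Hosts holding each block replica; compare against the RS group members.
        StringBuilder hosts = new StringBuilder();
        for (String h : loc.getHosts()) {
          if (hosts.length() > 0) hosts.append(",");
          hosts.append(h);
        }
        System.out.println(wal.getPath().getName() + " -> " + hosts);
      }
    }
  }
}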

--
Nikhil Bafna | 8095234263
Wellington Chevreuil
2018-12-10 12:24:23 UTC
Permalink
Indeed, the replicas could still go to DNs outside the given WAL group, so
it's probably worth opening a JIRA with the proposed changes, as it would
be easier to review there. As mentioned in my previous email, you might
want to target newer branches rather than branch-1.
Nikhil Bafna
2018-12-10 12:26:25 UTC
Permalink
Yes, on both counts. Will open a JIRA and target 2.x.
