

Tuesday, March 20, 2018

HDFS Centralized Cache Management



Due to increasing memory capacity, many interesting working sets can fit entirely in aggregate cluster memory. By using HDFS centralized cache management, applications can take advantage of the performance benefits of in-memory computation. Cluster cache state is aggregated and controlled by the NameNode, allowing application schedulers to place their tasks for cache locality.


If your applications repeatedly access the same data, you can get improved performance by enabling centralized cache management in HDFS. You need to specify paths to directories or files that will be cached by HDFS. The NameNode will communicate with DataNodes that have the desired blocks available on disk, and instruct the DataNodes to cache the blocks in off-heap caches.

Caching Use Cases

Files that are accessed repeatedly: For example, a small fact table in Hive that is often used for joins is a good candidate for caching.


Mixed workloads with performance SLAs: Caching the working set of a high priority workload ensures that it does not compete with low priority workloads for disk I/O.

Caching Terminology


Cache Directive: Defines a path that will be cached; the path can be either a directory or a file. Directories are cached non-recursively, meaning only the files in the first-level listing of the directory will be cached.
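
For example, a directive on a directory caches only its first-level files; files in subdirectories need their own directives. A minimal sketch, using the /data/employee path and the cachePool1 pool that appear later in this post:

-- Caches files directly under /data/employee, but nothing in its subdirectories
hdfs cacheadmin -addDirective -path /data/employee -pool cachePool1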


Cache Pool: An administrative entity used to manage groups of Cache Directives. Cache Pools have UNIX-like permissions that restrict which users and groups have access to the pool. Write permissions allow users to add Cache Directives to the pool and remove them. Read permissions allow users to list the Cache Directives in a pool, as well as additional metadata. Execute permissions are unused.
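
All of these pool attributes can be set when the pool is created. A sketch of a locked-down pool (the pool name, owner, and values here are illustrative, not from this cluster):

-- Mode 0750 (rwxr-x---): group members can list directives, others have no access
hdfs cacheadmin -addPool reportsPool -owner hive -group hadoop -mode 0750 -limit 104857600 -maxTtl 7d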

Configuring Centralized Caching


Native Libraries: In order to lock block files into memory, the DataNode relies on native JNI code found in libhadoop.so ($HADOOP_HOME/lib/native). Be sure to enable JNI if you are using HDFS centralized cache management.


JNI is the Java Native Interface. It defines a way for managed code (written in the Java programming language) to interact with native code (written in C/C++).
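
Before enabling caching, you can confirm that the native library is actually loadable with the hadoop checknative command; look for "hadoop: true" in its output:

[hdfs@nn01 ~]$ hadoop checknative -a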

Configuration Properties: Configuration properties for centralized caching are specified in the hdfs-site.xml file.

Required Properties


Currently, only one property, dfs.datanode.max.locked.memory, is required; it determines the maximum amount of memory (in bytes) that a DataNode will use for caching. The "locked-in-memory size" ulimit (ulimit -l) of the DataNode user also needs to be increased to match or exceed this value. On HDP 2.6, add this property to the HDFS service in custom hdfs-site.xml and restart the service. The example below sets it to 268435456 bytes (256 MB).

<property>
  <name>dfs.datanode.max.locked.memory</name>
  <value>268435456</value>
</property>



[root@nn01 ~]# ulimit -l
64

[root@nn01 ~]# ulimit -Ha
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127526
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127526
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited



Set memlock limits on each DataNode. The change takes effect after you log out and log back in.

# On each DataNode; limits.conf values are in KB, so 65536 KB = 64 MB of lockable memory
[root@dn01 ~]# echo "* hard  memlock 65536" >> /etc/security/limits.conf
[root@dn01 ~]# echo "* soft  memlock 65536" >> /etc/security/limits.conf

[root@nn01 ~]# ulimit -Sa

The -H option instructs ulimit to display hard resource limits.

The -S option instructs ulimit to display soft resource limits.
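
After logging back in on a DataNode, a quick sanity check that the new memlock value took effect:

# Both should now report 65536 (limits.conf values are in KB)
ulimit -Sl
ulimit -Hl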


Optional Properties

The following properties are not required, but can be specified for tuning.

dfs.namenode.path.based.cache.refresh.interval.ms : NameNode will use this value as the number of milliseconds between subsequent cache path re-scans.

<property>
  <name>dfs.namenode.path.based.cache.refresh.interval.ms</name>
  <value>300000</value>
</property>

dfs.time.between.resending.caching.directives.ms : NameNode will use this value as the number of milliseconds between resending caching directives.

<property>
  <name>dfs.time.between.resending.caching.directives.ms</name>
  <value>300000</value>
</property>

dfs.datanode.fsdatasetcache.max.threads.per.volume : DataNode will use this value as the maximum number of threads per volume to use for caching new data.

<property>
  <name>dfs.datanode.fsdatasetcache.max.threads.per.volume</name>
  <value>4</value>
</property>

dfs.cachereport.intervalMsec : DataNode will use this value as the number of milliseconds between sending a full report of its cache state to the NameNode.

<property>
  <name>dfs.cachereport.intervalMsec</name>
  <value>10000</value>
</property>


dfs.namenode.path.based.cache.block.map.allocation.percent : The percentage of the Java heap that will be allocated to the cached blocks map. 

<property>
  <name>dfs.namenode.path.based.cache.block.map.allocation.percent</name>
  <value>0.25</value>

</property>

OS Limits


If you get the error "Cannot start datanode because the configured max locked memory size...is more than the datanode's available RLIMIT_MEMLOCK ulimit," that means that the operating system is imposing a lower limit on the amount of memory that you can lock than what you have configured. To fix this, you must adjust the ulimit -l value that the DataNode runs with. 


Set memlock (the maximum locked-in-memory address space) to a value slightly higher than the dfs.datanode.max.locked.memory property. Note that limits.conf values are in KB. For example, to give the hdfs user a 130 KB memlock limit, you would add the following line to /etc/security/limits.conf.


hdfs - memlock 130
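
Since the DataNode runs as the hdfs user, it is that user's limit that matters; a quick check from a fresh login shell (assuming the hdfs account has a login shell):

[root@dn01 ~]# su - hdfs -c 'ulimit -l'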



Using Cache Pools and Directives

You can use the command-line interface (CLI) to create, modify, and list Cache Pools and Cache Directives via the hdfs cacheadmin subcommand. Cache Directives are identified by a unique 64-bit integer ID; IDs are not reused even after a Cache Directive is removed. Cache Pools are identified by a unique string name. You must first create a Cache Pool, and then add Cache Directives to it.

-- Add a new Cache Pool

hdfs cacheadmin -addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>] 

[hdfs@nn01 ~]$ hdfs cacheadmin -addPool cachePool1
Successfully added cache pool cachePool1.

--List existing pools

[hdfs@nn01 ~]$ hdfs cacheadmin -listPools
Found 1 result.
NAME        OWNER  GROUP   MODE            LIMIT  MAXTTL
cachePool1  hdfs   hadoop  rwxr-xr-x   unlimited   never

--List with additional stats

[hdfs@nn01 ~]$ hdfs cacheadmin -listPools -stats
Found 1 result.
NAME        OWNER  GROUP   MODE            LIMIT  MAXTTL  BYTES_NEEDED  BYTES_CACHED  BYTES_OVERLIMIT  FILES_NEEDED  FILES_CACHED
cachePool1  hdfs   hadoop  rwxr-xr-x   unlimited   never             0             0                0             0             0

--List for specific cache pool

[hdfs@en01 ~]$ hdfs cacheadmin -listPools -stats cachePool1
Found 1 result.
NAME        OWNER  GROUP   MODE            LIMIT  MAXTTL  BYTES_NEEDED  BYTES_CACHED  BYTES_OVERLIMIT  FILES_NEEDED  FILES_CACHED
cachePool1  hdfs   hadoop  rwxr-xr-x   unlimited   never             0             0                0             0             0

-- Get help
[hdfs@en01 ~]$ hdfs cacheadmin -help

-- Add a new Cache Directive
hdfs cacheadmin -addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>] 

[hdfs@nn01 ~]$ hdfs cacheadmin -addDirective -path /data/employee/emp.csv -pool cachePool1
Added cache directive 1

[hdfs@en01 ~]$ hdfs cacheadmin -addDirective -path /data/employee/emp.csv -pool cachePool1 -replication 3
Added cache directive 2

--List directives

[hdfs@nn01 ~]$ hdfs cacheadmin -listDirectives
Found 1 entry
 ID POOL         REPL EXPIRY  PATH
  1 cachePool1      1 never   /data/employee/emp.csv


[hdfs@nn01 ~]$ hdfs cacheadmin -listPools -stats cachePool1
Found 1 result.
NAME        OWNER  GROUP   MODE            LIMIT  MAXTTL  BYTES_NEEDED  BYTES_CACHED  BYTES_OVERLIMIT  FILES_NEEDED  FILES_CACHED
cachePool1  hdfs   hadoop  rwxr-xr-x   unlimited   never           617             0                0             1             0

-- Remove a Cache Directive
[hdfs@en01 ~]$ hdfs cacheadmin -removeDirective 1
Removed cached directive 1


-- Remove all of the Cache Directives in a specified path
hdfs cacheadmin -removeDirectives -path <path>
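
For example, to drop every directive on the /data/employee path used above (this requires write permission on the pools containing the directives):

[hdfs@nn01 ~]$ hdfs cacheadmin -removeDirectives -path /data/employee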


-- Look at the DataNode stats; note that one of the DataNodes is using one 4 KB page of cache, since our directive was created with replication 1

[hdfs@nn01 ~]$ hdfs dfsadmin -report
Configured Capacity: 4366617690624 (3.97 TB)
Present Capacity: 4186504465137 (3.81 TB)
DFS Remaining: 4052661449784 (3.69 TB)
DFS Used: 133843015353 (124.65 GB)
DFS Used%: 3.20%
Under replicated blocks: 63
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.xx.xxx:50010 (dn02)
Hostname: dn02
Decommission Status : Normal
Configured Capacity: 1455539230208 (1.32 TB)
DFS Used: 44614710332 (41.55 GB)
Non DFS Used: 59216794052 (55.15 GB)
DFS Remaining: 1351170855272 (1.23 TB)
DFS Used%: 3.07%
DFS Remaining%: 92.83%
Configured Cache Capacity: 65536 (64 KB)
Cache Used: 0 (0 B)
Cache Remaining: 65536 (64 KB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 10
Last contact: Tue Mar 20 14:03:33 AST 2018


Name: 192.168.xx.xxx:50010 (dn03)
Hostname: dn03
Decommission Status : Normal
Configured Capacity: 1455539230208 (1.32 TB)
DFS Used: 44614280252 (41.55 GB)
Non DFS Used: 59214389700 (55.15 GB)
DFS Remaining: 1351173689704 (1.23 TB)
DFS Used%: 3.07%
DFS Remaining%: 92.83%
Configured Cache Capacity: 65536 (64 KB)
Cache Used: 4096 (4 KB)
Cache Remaining: 61440 (60 KB)
Cache Used%: 6.25%
Cache Remaining%: 93.75%
Xceivers: 10
Last contact: Tue Mar 20 14:03:33 AST 2018


Name: 192.168.xx.xxx:50010 (dn01)
Hostname: dn01
Decommission Status : Normal
Configured Capacity: 1455539230208 (1.32 TB)
DFS Used: 44614024769 (41.55 GB)
Non DFS Used: 60071430079 (55.95 GB)
DFS Remaining: 1350316904808 (1.23 TB)
DFS Used%: 3.07%
DFS Remaining%: 92.77%
Configured Cache Capacity: 65536 (64 KB)
Cache Used: 0 (0 B)
Cache Remaining: 65536 (64 KB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 10
Last contact: Tue Mar 20 14:03:34 AST 2018
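
As a final check, -listDirectives also accepts -stats and -pool filters, which show how much of each directive is actually resident in cache:

-- BYTES_CACHED and FILES_CACHED indicate how much of each directive has been cached
[hdfs@nn01 ~]$ hdfs cacheadmin -listDirectives -stats -pool cachePool1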