Netty 4-based RPC System Development
Allan Huang @ NEUTEC.com
Agenda
Netty 4
RPC System Design
Performance Test
Netty 4
Netty 4
A non-blocking I/O (NIO) client-server framework for developing Java network applications such as protocol servers and clients.
An asynchronous, event-driven network application framework that simplifies network programming.
The Reactor pattern is an event-handling pattern for service requests that are delivered concurrently to a service handler by one or more inputs.
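The Reactor pattern can be illustrated with plain JDK NIO, without Netty. The following is only a minimal sketch: two in-process pipes stand in for network inputs, and the MiniReactor class and ReadHandler interface are hypothetical names introduced for illustration. One loop waits on a Selector and dispatches each ready event to the handler attached to it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class MiniReactor {
    // Event handler bound to one input source (hypothetical interface for this sketch).
    interface ReadHandler {
        void onReadable(Pipe.SourceChannel source) throws IOException;
    }

    static List<String> demo() {
        final List<String> received = new ArrayList<String>();
        try (Selector selector = Selector.open()) {
            Pipe[] inputs = {Pipe.open(), Pipe.open()};
            for (Pipe input : inputs) {
                input.source().configureBlocking(false);
                ReadHandler handler = source -> {
                    ByteBuffer buffer = ByteBuffer.allocate(64);
                    int n = source.read(buffer);
                    if (n > 0) {
                        received.add(new String(buffer.array(), 0, n, StandardCharsets.UTF_8));
                    }
                };
                // Register interest in READ events and attach the handler to the key.
                input.source().register(selector, SelectionKey.OP_READ, handler);
            }
            // Two inputs deliver one request each.
            inputs[0].sink().write(ByteBuffer.wrap("request-1".getBytes(StandardCharsets.UTF_8)));
            inputs[1].sink().write(ByteBuffer.wrap("request-2".getBytes(StandardCharsets.UTF_8)));

            // Reactor loop: wait for ready events, then dispatch each to its handler.
            for (int round = 0; round < 10 && received.size() < 2; round++) {
                selector.select(1000);
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isReadable()) {
                        ((ReadHandler) key.attachment()).onReadable((Pipe.SourceChannel) key.channel());
                    }
                }
            }
            for (Pipe input : inputs) {
                input.sink().close();
                input.source().close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

Calling MiniReactor.demo() returns both request strings; the order in which they arrive is not guaranteed.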
Netty 4 Components
Netty 4 Thread Model
RPC System Design
RPC
Remote Procedure Call: also called Remote Invocation or Remote Method Invocation (RMI).
Stub: Acts as a gateway for the Caller; all outgoing requests to the Callee are routed through it.
Skeleton: Acts as a gateway for the Callee; all incoming Caller requests are routed through it.
Parameters Marshalling & Unmarshalling: Packs and unpacks the parameters.
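The stub, skeleton and marshalling roles can be sketched in a few lines of plain Java. This is an illustration only, not the system's actual design: the Calculator interface, the "method:arg1,arg2" wire format and all class names are assumptions, and the "network" is a direct method call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical service interface shared by Caller and Callee.
interface Calculator {
    int add(int a, int b);
}

// Callee: the real business logic on the server side.
class CalculatorImpl implements Calculator {
    public int add(int a, int b) {
        return a + b;
    }
}

// Skeleton: gateway for the Callee. Unpacks an incoming call and invokes the Callee.
class CalculatorSkeleton {
    private final Calculator callee;

    CalculatorSkeleton(Calculator callee) {
        this.callee = callee;
    }

    // Assumed wire format for this sketch only: "methodName:arg1,arg2" with int arguments.
    String handle(String wire) {
        String[] parts = wire.split(":");
        String[] rawArgs = parts[1].split(",");
        Object[] args = new Object[rawArgs.length];
        Class<?>[] types = new Class<?>[rawArgs.length];
        for (int i = 0; i < rawArgs.length; i++) {
            args[i] = Integer.valueOf(rawArgs[i]); // unmarshal each parameter
            types[i] = int.class;
        }
        try {
            Method method = Calculator.class.getMethod(parts[0], types);
            return String.valueOf(method.invoke(callee, args)); // marshal the result
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("invocation failed: " + wire, e);
        }
    }
}

// Stub: gateway for the Caller. Packs each call and routes it to the skeleton.
class CalculatorStub {
    static Calculator create(final CalculatorSkeleton remote) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                StringBuilder wire = new StringBuilder(method.getName()).append(':');
                for (int i = 0; i < args.length; i++) {
                    if (i > 0) {
                        wire.append(',');
                    }
                    wire.append(args[i]); // marshal each parameter
                }
                // In a real system this string would travel over a channel.
                return Integer.valueOf(remote.handle(wire.toString()));
            }
        };
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(), new Class<?>[] {Calculator.class}, handler);
    }
}
```

The caller only sees the Calculator interface: CalculatorStub.create(new CalculatorSkeleton(new CalculatorImpl())).add(2, 3) travels through the stub and the skeleton before reaching the callee.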
RPC
Features
NIO-based client and server with TCP socket
Many-to-many relationship among servers and clients with multi-channels
Parameters Marshalling & Unmarshalling by JSON
Idle Channels Detection
Inactive Channels Reconnection
Cross-platform remote invocation by JSON and TCP socket
High Availability
RPC Flow
RPC Deployment
Client-side Component
Client Channel Manager: Manages all channel proxies, and starts up and shuts down the Netty thread pool. Conceptually, it is like a JDBC Driver Manager.
  Key Techniques: concurrent.ConcurrentHashMap, concurrent.locks.ReentrantLock, concurrent.locks.Condition
Client Channel Proxy: Wraps a Netty channel. Conceptually, it is like a JDBC Connection. Every request must have a unique ID per channel.
  Key Techniques: concurrent.ConcurrentHashMap
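One plausible way to give every request a unique ID per channel and to track outstanding requests in a ConcurrentHashMap is sketched below. The PendingRequests class and its method names are hypothetical, and the AtomicLong counter is an assumption; the slides do not say how IDs are generated.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Per-channel table of requests that are still waiting for a response.
// T is whatever object waits for the response (for example a future).
class PendingRequests<T> {
    private final AtomicLong nextId = new AtomicLong();
    private final ConcurrentMap<Long, T> pending = new ConcurrentHashMap<Long, T>();

    // Assigns a fresh ID (unique within this channel) and remembers the waiter.
    long register(T waiter) {
        long id = nextId.incrementAndGet();
        pending.put(id, waiter);
        return id;
    }

    // Called when the response with this ID arrives; returns null for unknown IDs.
    T complete(long id) {
        return pending.remove(id);
    }

    int size() {
        return pending.size();
    }
}
```

Because each channel owns its own table, IDs only need to be unique per channel, and responses can arrive in any order.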
Client-side Component
Client Channel Initializer: Creates the channel handlers needed on the client side.
  Heartbeat Handler: Sends a dummy message to the remote server if the channel has been idle for too long.
  Delimiter Based Frame Decoder: Splits the received string into frames.
  UTF8 String Encoder and Decoder: Encodes the request string and decodes the received string.
  Client Channel Handler: The most important channel handler. Processes all data sent to or received from the channel on the client side.
Server-side Component
Server Channel Manager: Starts up and shuts down the Netty thread pool.
Server Channel Initializer: Creates the channel handlers needed on the server side.
  Delimiter Based Frame Decoder: Splits the received string into frames.
  UTF8 String Encoder and Decoder: Encodes the request string and decodes the received string.
  Server Channel Handler: The most important channel handler. Processes all data sent to or received from the channel.
RPC Core Component
Command: A container that wraps a skeleton's ID, the method to invoke, and the parameters to pass to it.
Result: A container that wraps the object returned by the invoked callee. This object is either an ordinary object or an exception.
Request: A container that wraps a command. Conceptually, it is like an HTTP Request.
Response: A container that wraps a result object. Conceptually, it is like an HTTP Response.
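The four containers could look roughly like the classes below. This is a sketch with assumed field names (the Rpc prefix is only there to avoid clashes); in particular, carrying the request ID in the response so the two can be matched is an assumption.

```java
// Command: skeleton ID, method to invoke, and its parameters.
final class RpcCommand {
    final String skeletonId;
    final String method;
    final Object[] parameters;

    RpcCommand(String skeletonId, String method, Object... parameters) {
        this.skeletonId = skeletonId;
        this.method = method;
        this.parameters = parameters;
    }
}

// Result: either the object returned by the callee or the exception it threw.
final class RpcResult {
    final Object value;
    final Throwable error;

    private RpcResult(Object value, Throwable error) {
        this.value = value;
        this.error = error;
    }

    static RpcResult ok(Object value) {
        return new RpcResult(value, null);
    }

    static RpcResult failed(Throwable error) {
        return new RpcResult(null, error);
    }

    boolean isException() {
        return error != null;
    }
}

// Request: wraps a command and carries the per-channel unique ID.
final class RpcRequest {
    final long id;
    final RpcCommand command;

    RpcRequest(long id, RpcCommand command) {
        this.id = id;
        this.command = command;
    }
}

// Response: wraps a result; requestId (an assumption) ties it back to its request.
final class RpcResponse {
    final long requestId;
    final RpcResult result;

    RpcResponse(long requestId, RpcResult result) {
        this.requestId = requestId;
        this.result = result;
    }
}
```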
Response Future
Wraps an asynchronous computation into a synchronous (blocking) computation.
Key Techniques: concurrent.CountDownLatch, concurrent.locks.ReentrantLock, concurrent.locks.Condition
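A minimal sketch of such a future, using only CountDownLatch: the caller thread blocks in get() until the I/O thread calls set(). The class and method names are hypothetical, and the timeout is an assumption added so that a lost response cannot block the caller forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ResponseFuture<V> {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile V value;

    // Called by the I/O thread when the response arrives.
    void set(V response) {
        value = response;
        done.countDown();
    }

    // Called by the caller thread; blocks until set() is called or the timeout expires.
    V get(long timeout, TimeUnit unit) throws InterruptedException, TimeoutException {
        if (!done.await(timeout, unit)) {
            throw new TimeoutException("no response within timeout");
        }
        return value;
    }
}

class ResponseFutureDemo {
    // A second thread plays the role of the I/O thread delivering the response.
    static String awaitResponse() {
        final ResponseFuture<String> future = new ResponseFuture<String>();
        new Thread(() -> future.set("pong")).start();
        try {
            return future.get(2, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Nobody ever completes this future, so get() is expected to time out.
    static boolean timesOut() {
        try {
            new ResponseFuture<String>().get(50, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```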
Marshalling & Unmarshalling
A command wrapped in a request is serialized and deserialized as JSON; likewise, so is a result wrapped in a response.
The JSON serialization utility is based on the GSON library, which performs well in JSON conversion.
If you don't or can't deploy the Java 7-based remoting client component, you can still connect to a remote server over a TCP socket. However, you then lose the advantages that the client component provides.
Requests are separated by a delimiter: a byte array with a "zero" value.
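For a client that talks to the server over a raw TCP socket, the framing rule can be sketched with the JDK alone. The NulFramer class is hypothetical, and the sketch assumes the delimiter is a single zero byte appended after each UTF-8 encoded JSON string.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NulFramer {
    // Bytes of the frame currently being assembled.
    private final ByteArrayOutputStream partial = new ByteArrayOutputStream();

    // Encodes one message: UTF-8 bytes followed by a single zero byte.
    static byte[] encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(body, body.length + 1); // the padding byte is 0
    }

    // Feeds a chunk as read from the socket; returns every frame completed by it.
    // A chunk may contain a partial frame, exactly one frame, or several frames.
    List<String> feed(byte[] chunk) {
        List<String> frames = new ArrayList<String>();
        for (byte b : chunk) {
            if (b == 0) {
                frames.add(new String(partial.toByteArray(), StandardCharsets.UTF_8));
                partial.reset();
            } else {
                partial.write(b);
            }
        }
        return frames;
    }
}
```

A zero byte is a safe delimiter here because JSON text encoded as UTF-8 does not contain a raw zero byte.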
Service Stub & Skeleton
Service Stub: Creates a command, selects a channel proxy, sends the command, and finally gets a result.
Service Skeleton: Invokes the business logic object that matches a command and returns a result.
Command Executor
Command Director: Finds the matching callee and invokes it by naming rule and hard-coding.
Command Reflector (Unofficial): Finds the matching callee and invokes it by naming rule and the Java Reflection API.
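The difference between the two executors can be sketched as follows. Everything here is illustrative: GreetingService is a made-up callee, the CommandExecutor signature is assumed, and the actual naming rule is not given in the slides, so the reflector simply looks up a registered object by skeleton ID and a method by name and parameter count.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical callee used only for this sketch.
class GreetingService {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

interface CommandExecutor {
    Object execute(String skeletonId, String method, Object... params);
}

// Director style: every supported command is dispatched by hard-coded branches.
class CommandDirector implements CommandExecutor {
    private final GreetingService greeting = new GreetingService();

    public Object execute(String skeletonId, String method, Object... params) {
        if ("greeting".equals(skeletonId) && "greet".equals(method)) {
            return greeting.greet((String) params[0]);
        }
        throw new IllegalArgumentException("unknown command: " + skeletonId + "." + method);
    }
}

// Reflector style: the callee method is found at run time with the Reflection API.
class CommandReflector implements CommandExecutor {
    private final Map<String, Object> callees = new HashMap<String, Object>();

    void register(String skeletonId, Object callee) {
        callees.put(skeletonId, callee);
    }

    public Object execute(String skeletonId, String method, Object... params) {
        Object callee = callees.get(skeletonId);
        if (callee == null) {
            throw new IllegalArgumentException("unknown skeleton: " + skeletonId);
        }
        for (Method candidate : callee.getClass().getMethods()) {
            if (candidate.getName().equals(method)
                    && candidate.getParameterTypes().length == params.length) {
                try {
                    return candidate.invoke(callee, params);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no such method: " + skeletonId + "." + method);
    }
}
```

The director needs a code change for each new command but involves no reflection; the reflector accepts new callees through register() alone.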
High Availability
Load balance: Dispatches channel proxies of the different servers to the stub according to a Round-Robin rule.
Fail over: Skips a broken channel proxy and finds the next available one. If no channel proxy is available, all stubs wait until a channel proxy is reconnected.
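The load-balance and fail-over rules can be sketched with ReentrantLock and Condition. DemoChannelProxy and RoundRobinSelector are hypothetical names, and the timeout on select() is an assumption added so that the sketch cannot block forever; the slides describe waiting until a proxy is reconnected.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a channel proxy; it only tracks whether its channel is usable.
class DemoChannelProxy {
    final String name;
    volatile boolean active;

    DemoChannelProxy(String name, boolean active) {
        this.name = name;
        this.active = active;
    }
}

class RoundRobinSelector {
    private final List<DemoChannelProxy> proxies;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition reconnected = lock.newCondition();
    private int cursor = 0;

    RoundRobinSelector(DemoChannelProxy... proxies) {
        this.proxies = Arrays.asList(proxies);
    }

    // Returns the next active proxy in round-robin order, or null if none
    // becomes available within the timeout (the timeout is an assumption).
    DemoChannelProxy select(long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (true) {
                for (int i = 0; i < proxies.size(); i++) {
                    DemoChannelProxy candidate = proxies.get(cursor);
                    cursor = (cursor + 1) % proxies.size();
                    if (candidate.active) {
                        return candidate; // fail over: broken proxies were skipped
                    }
                }
                if (nanos <= 0) {
                    return null;
                }
                nanos = reconnected.awaitNanos(nanos); // wait for a reconnection
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            lock.unlock();
        }
    }

    void markBroken(DemoChannelProxy proxy) {
        proxy.active = false;
    }

    // Marks a proxy usable again and wakes up every stub waiting in select().
    void markReconnected(DemoChannelProxy proxy) {
        lock.lock();
        try {
            proxy.active = true;
            reconnected.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```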
Administration
Admin Servlet: Auto-deployed by the @WebServlet annotation.
HTTP URL: http://${client.host}/${context.path}/remoting/admin.do
HTTP parameters
  action: The name of the administration instruction.
  host: The host / IP of the Netty server.
  port: The port number that the remote Netty server listens on.
Management Actions
List: Lists all channel proxies and shows their status.
Stop: Closes the Netty channels and stops accepting commands.
Pause: Pauses the channels so that they temporarily accept no commands.
Restart: Reconnects the Netty channels and starts accepting commands again.
Performance Test
Remoting Server Properties
Configuration File: remoting_server.properties
server.local.port.{n}: The port number that the Netty server listens on.
server.event.executor.size (default: 8): The number of Event Executor threads in the Netty server.
server.io.thread.size (default: 4): The number of I/O threads in the Netty server. These are the Child Group threads in Netty terms.
command.executor.implementor.className: The class name of a Command Executor implementor.
Remoting Client Properties
Configuration File: remoting_client.properties
remote.server.host.{n}: The host / IP of the Netty server.
remote.server.port.{n}: The port number that the remote Netty server listens on.
client.channel.size (default: 8): The number of channels opened between a Netty client and a Netty server.
client.event.executor.size (default: 8): The number of Event Executor threads in the Netty client.
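Reading the client properties, including the indexed {n} keys and the documented defaults, might look like the sketch below. The RemotingClientConfig class is hypothetical, and treating {n} as a contiguous index starting at 1 is an assumption; the port 9000 in the usage note is a made-up value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class RemotingClientConfig {
    final List<String> servers = new ArrayList<String>(); // entries of the form "host:port"
    final int channelSize;
    final int eventExecutorSize;

    RemotingClientConfig(Properties p) {
        // Assumption: {n} starts at 1 and the numbering has no gaps.
        for (int n = 1; p.getProperty("remote.server.host." + n) != null; n++) {
            String host = p.getProperty("remote.server.host." + n).trim();
            String port = p.getProperty("remote.server.port." + n, "").trim();
            servers.add(host + ":" + port);
        }
        // Defaults as documented for the client: 8 channels, 8 event executor threads.
        channelSize = Integer.parseInt(p.getProperty("client.channel.size", "8").trim());
        eventExecutorSize = Integer.parseInt(p.getProperty("client.event.executor.size", "8").trim());
    }

    // Parses the text of a properties file such as remoting_client.properties.
    static RemotingClientConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return new RemotingClientConfig(p);
    }
}
```

For example, a file containing remote.server.host.1=10.10.9.203, remote.server.port.1=9000 and client.channel.size=4 yields one server entry, four channels, and the default of eight event executor threads.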
Functional Test
Auto-Validation
  Client A generates a fixed-length random string and sends it to Server B.
  Server B receives the string sent by Client A and sends the same string back to Client A.
  If the string Client A receives differs from the string it originally sent, it logs an error message to the log file.
Simulation
  Simulates a large number of urgent requests, sent with the Java 7 Fork / Join framework.
  concurrent.ForkJoinPool is superior to concurrent.ExecutorService at handling multiple threads.
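The auto-validation loop can be sketched with a ForkJoinPool. The EchoCall interface is a hypothetical stand-in for the remote round trip, and the sketch counts mismatches instead of writing them to a log file; the parallelism of ten mirrors the ten sender threads mentioned in the test parameters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

class EchoValidationSimulator {
    // Hypothetical stand-in for "send the sample to Server B and get its reply".
    interface EchoCall {
        String send(String sample);
    }

    // Generates a fixed-length random string.
    static String randomSample(int length) {
        String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        StringBuilder sample = new StringBuilder(length);
        ThreadLocalRandom random = ThreadLocalRandom.current();
        for (int i = 0; i < length; i++) {
            sample.append(alphabet.charAt(random.nextInt(alphabet.length())));
        }
        return sample.toString();
    }

    // Sends requestSize samples through the pool and counts replies that differ.
    static int countMismatches(final EchoCall echo, int requestSize, final int sampleLength) {
        List<Callable<Boolean>> tasks = new ArrayList<Callable<Boolean>>();
        for (int i = 0; i < requestSize; i++) {
            tasks.add(() -> {
                String sent = randomSample(sampleLength);
                return sent.equals(echo.send(sent));
            });
        }
        ForkJoinPool pool = new ForkJoinPool(10);
        int mismatches = 0;
        try {
            for (Future<Boolean> outcome : pool.invokeAll(tasks)) {
                if (!outcome.get()) {
                    mismatches++;
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return mismatches;
    }
}
```

With a correct echo, countMismatches(sample -> sample, 1000, 16) reports no mismatches.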
Test Parameter (1)
Configuration: simulator.properties
client.test.repeat.time: The number of times a Netty client repeats a set of test cases. Applies only to the urgent mode.
client.request.size: The number of requests sent at once from a Netty client to a Netty server. All requests are sent by ten threads. Applies only to the urgent mode.
client.sample.length: The length of the random sample string. Applies to all request modes.
Test Parameter (2)
client.request.period: The period, in milliseconds, at which a Netty client sends a request. All requests are sent by ten threads. Applies only to the heavy / normal mode.
client.request.mode: The frequency mode in which a Netty client sends requests to a Netty server.
  Urgent: Simulates a situation in which a great number of urgent requests arrive at once.
  Heavy: Simulates a situation in which many requests arrive continuously.
Test Environment
Hardware
  10.10.9.203: Linux PC, version 2.6.18-308.el5, AMD 64-bit 4-core processors.
  10.10.9.221: Linux PC, version 2.6.18-194.el5, AMD 64-bit 4-core processors.
  10.10.9.225: Windows 7 PC, 2-core processors.
Software
  Java 1.7.0_xx or higher.
  Server VM Arguments: -server -Xmx1024m -Xms1024m -XX:PermSize=128m -XX:MaxPermSize=128m
  Client VM Arguments: -server -Xmx512m -Xms512m -XX:PermSize=64m -XX:MaxPermSize=64m
  YourKit Java Profiler, version 11.0.8.
Test Case
Test Data – Server-side
Test Case No. | Scenario | IP | Total Exec (num) | Total Exec Time (ms) | Avg Exec Time (ms) | Max Exec Time (ms) | Min Exec Time (ms) | CPU Peak (%) | Used Heap (MB) | Peak Threads | Total Threads | GC (freq)
101 | 1 : 1 | 10.10.9.203 | 100,000 | 10,851 | 0.108511 | 143.743 | 0.018 | 48% | 60 | 23 | 30 | 9
102 | 1 : 1 | 10.10.9.203 | 500,000 | 27,596 | 0.055191 | 367.458 | 0.018 | 44% | 41 | 23 | 28 | 38
103 | 1 : 1 | 10.10.9.203 | 1,000,000 | 47,585 | 0.047585 | 404.478 | 0.018 | 48% | 226 | 23 | 30 | 73
104 | 1 : 1 | 10.10.9.203 | 5,000,000 | 214,425 | 0.042885 | 323.704 | 0.016 | 25% | 247 | 23 | 28 | 358
105 | 1 : 1 | 10.10.9.203 | 7,500,000 | 324,398 | 0.043253 | 147.307 | 0.017 | 21% | 278 | 21 | 24 | 535
106 | m : 1 | 10.10.9.203 | 200,000 | 14,934 | 0.074668 | 144.637 | 0.018 | 46% | 175 | 23 | 29 | 16
107 | m : 1 | 10.10.9.203 | 1,000,000 | 47,551 | 0.047551 | 151.516 | 0.018 | 44% | 208 | 23 | 30 | 73
108 | m : 1 | 10.10.9.203 | 2,000,000 | 92,952 | 0.046476 | 498.322 | 0.018 | 41% | 293 | 23 | 29 | 144
109 | m : 1 | 10.10.9.203 | 5,000,000 | 205,735 | 0.041147 | 728.62 | 0.017 | 45% | 276 | 23 | 28 | 357
110 | m : 1 | 10.10.9.203 | 10,000,000 | 425,830 | 0.042583 | 418.904 | 0.016 | 32% | 275 | 23 | 28 | 713
Test Data – Client-side
Test Case No. | Scenario | IP | Total Exec (num) | Total Exec Time (ms) | Avg Exec Time (ms) | Max Exec Time (ms) | Min Exec Time (ms) | CPU Peak (%) | Used Heap (MB) | Peak Threads | Total Threads | GC (freq)
101 | 1 : 1 | 10.10.9.221 | 100,000 | 820,846 | 8.208458 | 681.254 | 0.777 | 99% | 92 | 125 | 131 | 22
102 | 1 : 1 | 10.10.9.221 | 500,000 | 3,080,054 | 6.160107 | 580.084 | 0.954 | 73% | 112 | 125 | 130 | 96
103 | 1 : 1 | 10.10.9.221 | 1,000,000 | 5,884,419 | 5.884419 | 414.946 | 0.772 | 100% | 136 | 125 | 130 | 183
104 | 1 : 1 | 10.10.9.221 | 5,000,000 | 90,317,740 | 18.063548 | 4732.995 | 0.773 | 94% | 390 | 124 | 132 | 1100
105 | 1 : 1 | 10.10.9.221 | 7,500,000 | 247,923,413 | 33.056455 | 1368.384 | 0.811 | 96% | 461 | 124 | 131 | 2548
106 | m : 1 | 10.10.9.221 | 100,000 | 802,180 | 8.021803 | 506.495 | 0.917 | 99% | 75 | 125 | 131 | 22
106 | m : 1 | 10.10.9.225 | 100,000 | 1,770,922 | 17.709223 | 1044.761342 | 1.058135 | 95% | 151 | 121 | 126 | 21
107 | m : 1 | 10.10.9.221 | 500,000 | 3,383,983 | 6.767965 | 576.605 | 0.887 | 100% | 150 | 125 | 131 | 96
107 | m : 1 | 10.10.9.225 | 500,000 | 6,366,195 | 12.73239 | 1422.267738 | 0.88364 | 93% | 202 | 121 | 127 | 94
108 | m : 1 | 10.10.9.221 | 1,000,000 | 9,531,271 | 9.531271 | 548.039 | 0.86 | 97% | 115 | 124 | 131 | 183
108 | m : 1 | 10.10.9.225 | 1,000,000 | 13,897,758 | 13.897758 | 1741.945942 | 1.003693 | 99% | 172 | 121 | 126 | 180
109 | m : 1 | 10.10.9.221 | 2,500,000 | 33,853,160 | 13.541264 | 1134.968394 | 1.022538 | 98% | 256 | 124 | 131 | 438
109 | m : 1 | 10.10.9.225 | 2,500,000 | 24,014,948 | 9.605979 | 687.003 | 0.817 | 99% | 261 | 121 | 126 | 433
110 | m : 1 | 10.10.9.221 | 5,000,000 | 96,517,855 | 19.303571 | 5378.937 | 0.83 | 96% | 341 | 124 | 131 | 1100
110 | m : 1 | 10.10.9.225 | 5,000,000 | 159,876,805 | 31.975361 | 6135.102405 | 0.844204 | 100% | 339 | 121 | 126 | 1083
Test Result
The profiler always logs exceptions thrown by the Netty framework.
java.lang.ClassNotFoundException
  io.netty.util.internal.PlatformDependent.javaVersion0()
  io.netty.util.internal.PlatformDependent.hasJavassist0()
  io.netty.util.internal.PlatformDependent.isAndroid0()
java.lang.SecurityException
  io.netty.util.internal.chmv8.ConcurrentHashMapV8.getUnsafe()
In test cases No. 105 and No. 110, thread deadlocks sometimes occur on the Netty client side.
Conclusion
Five million or fewer requests can be processed on the Netty client side in a short time.
All requests can be processed rapidly on the Netty server side, even when the total number of requests exceeds 10 million.
None of the test cases yields fully realistic data, because the heavy user request simulator and the remoting client component share the same JVM resources.
Better server grade? More memory allocation?
Q&A