It is not uncommon in the enterprise world for applications to consume data from files that run to millions of records and often contain duplicates. The application requirement could be to filter the data and insert it into a database or a queue for further processing. Camel has a component called Idempotent Consumer for exactly this purpose; combined with the File component, let us examine how efficient it can be.
We are going to use a file with 1 million orders (111112 of which are duplicate records), enclosed by an orders tag as below. We will read this file, split it, and then insert the order records into the Order table and the product records into the Product table.
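The sample file itself is not reproduced here; a minimal sketch of its shape (the element and field names inside each order are assumptions) would be:

<orders>
    <order>
        <id>1</id>
        <product>...</product>
    </order>
    <order>
        <id>2</id>
        <product>...</product>
    </order>
    ...
</orders>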
Now, obviously, loading this file completely into memory for processing is not always a feasible approach, so Camel provides us with the Splitter EIP along with its streaming capability. This reads the file chunk by chunk, based on the token provided as a delimiter, and streams the results (the advantage being that no XML models are loaded; we can receive each chunk as a plain string).
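The post's own snippet is not reproduced here; a minimal sketch of such a route follows, where the endpoint URIs, JAXB package, aggregation strategy, and timeout value are assumptions rather than the exact code from the post:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.aggregate.GroupedExchangeAggregationStrategy;

public class OrderFileRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("file:data/inbox?fileName=orders.xml&noop=true")
            // stream the file, emitting each <order>...</order> chunk as a plain String
            .split().tokenizeXML("order").streaming()
                // convert each chunk to an Order object (hypothetical JAXB package)
                .unmarshal().jaxb("com.example.orders")
                // batch the orders so the next endpoint receives groups of 1000
                .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
                    .completionSize(1000)
                    // do not wait forever if fewer than 1000 records remain
                    .completionTimeout(5000)
                .to("direct:insertOrders"); // hypothetical endpoint doing the insert
    }
}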
The code reads the XML file, splits it at each order, unmarshals the tokenized XML to an Order object, and aggregates 1000 orders before sending them to the next endpoint. Observe that two completion strategies are provided to the aggregator: the completion size produces batches of 1000, while the completion timeout ensures that the aggregator does not keep waiting indefinitely for the configured 1000 records to arrive (thereby making the program hang forever if they never come or are not present) and times out once the provided milliseconds elapse.
So the above code reads a file, splits the XML into tokens, and aggregates them by 1000 before sending them to the DB endpoint. What about the filtering of duplicate records that we talked about? Would you believe that adding 4 lines of code now enables us to do so? That is exactly how powerful Camel can get.
Just add the value that needs to be unique (it could be a node in the XML, or the value of a node) to a header, add an idempotent repository component, and provide that header to the repository as the key to use as the basis for duplicate filtering. And just like that, we have code that reads from a large file, filters duplicate orders, and sends them to a different endpoint for further processing.
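As a rough sketch (the header name, XPath expression, and cache size are assumptions), those extra lines slot in between the split and the aggregation of the earlier route:

// additional imports:
//   static org.apache.camel.builder.xml.XPathBuilder.xpath
//   org.apache.camel.processor.idempotent.MemoryIdempotentRepository
from("file:data/inbox?fileName=orders.xml&noop=true")
    .split().tokenizeXML("order").streaming()
        // pull the unique value out of each chunk into a header
        .setHeader("orderId", xpath("/order/id/text()", String.class))
        // drop any chunk whose orderId has already been seen
        .idempotentConsumer(header("orderId"),
                MemoryIdempotentRepository.memoryIdempotentRepository(1000000))
        .unmarshal().jaxb("com.example.orders")
        .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
            .completionSize(1000)
            .completionTimeout(5000)
        .to("direct:insertOrders");

An in-memory repository is the simplest choice here; Camel also ships JDBC, Infinispan, and Hazelcast backed idempotent repositories for when the set of seen keys must survive restarts.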
I was able to achieve about 10582 records inserted into the DB per second, and I have not even started to look into improving the code for memory reduction. Hit me with your comments.
For the complete code and the processing details, refer to my GitHub link here.
Friday, April 29, 2016
Wednesday, April 27, 2016
Colocated Symmetrical Live and Backup Cluster on JBoss EAP - With Parameterization
To demonstrate the HA failover mode with parameters, this article will use two nodes; the configuration can, however, be extended as needed to numerous nodes. Traditionally, for the collocated failover mode setup, the full-ha profile in the domain.xml is replicated (after the hornetq-server itself is copied within the profile to create a backup). This, however, becomes a tedious approach when there are, say, 4 or 5 live and backup cluster combinations. Follow the steps below and life should be a little easier.
Step 1:
Download the JBoss EAP 6.4 version.
Make two copies of the unzipped folder, as master and node.
Step 2:
Open domain.xml under the master/domain/configuration directory.
Delete the default, ha, and full profiles.
Navigate to the messaging subsystem under the full-ha profile.
Copy the hornetq-server section and paste it right after the first hornetq-server section.
Provide a name to the second hornetq-server to differentiate it from the first; it can be any name, so let us choose backup as the name.
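The result is two hornetq-server sections inside the full-ha profile, sketched roughly below (the exact subsystem namespace in your domain.xml may differ):

<subsystem xmlns="urn:jboss:domain:messaging:1.4">
    <hornetq-server>
        <!-- original live server, adjusted in Step 3 -->
    </hornetq-server>
    <hornetq-server name="backup">
        <!-- copy of the above, reconfigured in Step 4 -->
    </hornetq-server>
</subsystem>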
Step 3: Make changes to the default hornetq-server section as below.
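The post shows this configuration as an image; a rough sketch of the relevant live-server elements, assuming replication-based (non-shared-store) HornetQ HA, would be:

<hornetq-server>
    <!-- this is the live server of the group named by the ${groupa} parameter -->
    <backup>false</backup>
    <backup-group-name>${groupa}</backup-group-name>
    <check-for-live-server>true</check-for-live-server>
    <shared-store>false</shared-store>
    ...
</hornetq-server>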
Step 4: Change the configuration of the backup server (created by copying the original hornetq-server) as below.
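Again sketched roughly, with the element values inferred from the description that follows (the exact file is in the post's image):

<hornetq-server name="backup">
    <backup>true</backup>
    <backup-group-name>${groupb}</backup-group-name>
    <shared-store>false</shared-store>
    <connectors>
        <!-- server-id bumped to 2 so it differs from the live server in the same JVM -->
        <in-vm-connector name="in-vm" server-id="2"/>
        ...
    </connectors>
    <acceptors>
        <!-- bound to messaging2 (created in Step 5) to avoid a port conflict -->
        <netty-acceptor name="netty" socket-binding="messaging2"/>
        ...
    </acceptors>
    ...
</hornetq-server>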
The backup flag is changed to appropriately represent that this server will act as a backup, and the backup group name is now represented by the parameter ${groupb}. Also, the server-id is incremented to 2 to make it different from the live server running in the same JVM. Observe as well the change of the socket-binding to messaging2; this caters to the case where the backup server belonging to ${groupb} comes up because its master on a different machine has gone down. At such times this resolves the port conflict that would otherwise arise with the ${groupa} live server running on the current machine. (Jump to Step 5 to see the actual configuration of the socket-binding.)
Step 5:
Move to the full-ha-sockets section of the socket-binding-group and create a new socket-binding named messaging2 with the port 5446 (you can opt to provide any number that is different from the messaging socket-binding and will not conflict with the other EAP ports; 5446 is a tested port).
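For example:

<socket-binding name="messaging2" port="5446"/>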
Step 6:
Move to the server-groups section and remove the server-group referring to the “full” profile. Change the remaining server group's name to “hornetqparamcluster”; this can be anything, or you can choose to leave the name as is, just remember the name.
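The remaining group would then look roughly like this (other children of the group, such as deployments, are omitted):

<server-groups>
    <server-group name="hornetqparamcluster" profile="full-ha">
        <socket-binding-group ref="full-ha-sockets"/>
    </server-group>
</server-groups>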
Step 7:
Create a copy of the host-slave.xml file under the master/../configuration directory; the later steps refer to this copy as host-slave-node.xml.
Step 8:
Run the script ./add-user.sh under the master/bin directory (this will be the domain controller node) and create a Management Realm user called admin.
Run the script once again and create a user called myhornetqcluster, following the interactive steps; copy the <secret value="..."/> line that add-user prints at the end, as it is needed in Step 9.
Step 9:
- Open the host-slave.xml file under the master/../configuration directory and add the below text under the management security realm node
<server-identities>
    <secret value="xxxxxx"/>
</server-identities>
where xxxxxx is the text copied in Step 8.
- Move to the domain-controller section, remove the <local/> node and uncomment the <remote> node, then add the attribute username to the node with the value “myhornetqcluster”, the user created in Step 8 (see the sketch after this list).
- Repeat the above two steps for the file host-slave-node.xml
- Move to the servers section of the file host-slave.xml and change the group of the server to hornetqparamcluster.
- Add the system-properties section to provide values for the ${groupa} and ${groupb} parameters, and provide the port offset of 1000. Remove the second server configuration. (A sketch of both changes follows this list.)
- Repeat the above step by editing host-slave-node.xml, but reverse the values of the groupa and groupb parameters and provide the port offset of 2000 (remember we are running all the nodes on a single machine).
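Putting the pieces together, the edited host-slave.xml would contain fragments along these lines; the property values and server name here are assumptions, and host-slave-node.xml would swap the two values and use port-offset 2000:

<domain-controller>
    <remote host="${jboss.domain.master.address}" port="${jboss.domain.master.port:9999}"
            username="myhornetqcluster" security-realm="ManagementRealm"/>
</domain-controller>
...
<servers>
    <server name="server-one" group="hornetqparamcluster" auto-start="true">
        <system-properties>
            <!-- hypothetical group names: the two host files reverse these values -->
            <property name="groupa" value="group-a"/>
            <property name="groupb" value="group-b"/>
        </system-properties>
        <socket-bindings port-offset="1000"/>
    </server>
</servers>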
Step 10:
- Start the domain controller by running domain.sh under /master/bin as below
./domain.sh --domain-config=domain.xml --host-config=host-master.xml -b=192.168.56.101 -bmanagement=192.168.56.101
- Start the hornetq live server group by running domain.sh under /master/bin in a separate window
./domain.sh --domain-config=domain.xml --host-config=host-slave.xml -b=192.168.56.101 -bmanagement=192.168.56.101 -Djboss.domain.master.address=192.168.56.101 -Djboss.management.native.port=9993
- Start the second hornetq live server group by running domain.sh under /node/bin in a separate window
./domain.sh --domain-config=domain.xml --host-config=host-slave-node.xml -b=192.168.56.101 -bmanagement=192.168.56.101 -Djboss.domain.master.address=192.168.56.101 -Djboss.management.native.port=9993
Change the IP address to the address of your server.