1. I have been working with Apache Storm for some time now. I wanted to read messages from different message brokers and process them inside Storm. The brokers I'm interested in are RabbitMQ, JMS-based brokers like ActiveMQ and Apollo, Kestrel, and Kafka. The existing spouts lacked some of the functionality I was looking for. The most important things for me were total control of the message and the ability to read from multiple queues from a single spout. So I decided to write spouts for the various brokers, and so far I have written the RabbitMQ and JMS spouts. The source code can be found on GitHub.


    The software is released under the Apache License, Version 2.0, so you are free to use it any way you like.

  2. I was trying to start Apache Hadoop on my Ubuntu machine with the data node and name node on the same machine. I changed hdfs-site.xml as follows.

    [embedded hdfs-site.xml snippet not loaded]
    Then, when the datanode started, it reported that the permissions of the data directory were incorrect.

    WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /home/supun/dev/apache/hadoop-data/data, expected: rwxr-xr-x, while actual: rwxrwxr-x

    So I had to change the permissions of the data directory as follows, and it worked.

    chmod 755 hadoop-data/data
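
    The same check and fix can be scripted. The following is a minimal Python sketch, assuming the expected mode is the rwxr-xr-x (755) named in the warning; the helper name ensure_mode is hypothetical and not part of Hadoop.

```python
import os
import stat

def ensure_mode(path, expected=0o755):
    """Set `path` to the `expected` permission bits if they differ.

    Returns the permission bits the path had before the call.
    """
    actual = stat.S_IMODE(os.stat(path).st_mode)
    if actual != expected:
        # Equivalent to `chmod 755 path` when expected is 0o755.
        os.chmod(path, expected)
    return actual
```

    Calling ensure_mode("hadoop-data/data") before starting the datanode would report the old mode and tighten it if needed.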

  3. It seems humanity has finally found a way to create more efficient batteries.



    There is a long way to go, but the future seems promising. This could be one of the biggest inventions in human history. I am looking forward to the day when we can charge our cars within minutes, and our cell phones and other devices instantly.

  4. Here is a tutorial on some Perl basics that can be very handy. The tutorial covers how to connect to a database, read a config file, replace a string in a file, etc.

  5. Here is a simple regular expression for finding a sentence with a given word.

    [embedded regular expression snippet not loaded]
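
    As an illustration, here is a minimal Python sketch of one such expression (not necessarily the one originally posted). It assumes a sentence ends with ".", "!" or "?", so abbreviations such as "e.g." will split sentences early; the function name sentences_with_word is hypothetical.

```python
import re

def sentences_with_word(text, word):
    """Return every sentence in `text` that contains `word` as a whole word."""
    pattern = re.compile(
        # Text before the word, the word itself, text after it, then the
        # sentence terminator. None of the filler may cross a terminator.
        r"[^.!?]*\b" + re.escape(word) + r"\b[^.!?]*[.!?]",
        re.IGNORECASE,
    )
    return [m.group(0).strip() for m in pattern.finditer(text)]
```

    For example, sentences_with_word("Storm is fast. Hadoop stores data.", "storm") keeps only the first sentence, because "stores" does not match "storm" as a whole word.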


  6. We were given an assignment in one of our classes to write a port scanner. This was a great assignment and we learnt a lot from it. The assignment was pretty big, with a lot of functions. For those of you who don't know what a port scanner is, it is a piece of software that can be used to determine the status of network ports on a given machine. A port scanner can also be used to determine what software is running on the open ports. Ports are specific to TCP and UDP, but a port scanner's functionality is broader than that: it can also be used to determine which protocols are running on a machine. A good place to look at for more details is here. To keep our discussion simple, let's assume we are only going to scan TCP ports.

    One of the key things about a port scanner is the ability to scan ports in parallel. There are 2^16 (65,536) port numbers on a given machine. Usually we want to scan only a subset of these ports, but even this subset can be huge. If we scan the ports one by one, it will take a lot of time. To avoid this, we have to send requests in parallel to the range of ports that we scan.
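
    To illustrate the parallelism, here is a minimal Python sketch that probes ports concurrently with a thread pool. Note that it uses a plain connect scan (a full TCP connection per port) so that it runs without special privileges; it is not a SYN scan. The function names, the timeout and the worker count are assumptions chosen for illustration.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def is_open(host, port, timeout=0.5):
    """Return True if a full TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return s.connect_ex((host, port)) == 0

def scan(host, ports, workers=32):
    """Probe `ports` concurrently and return the sorted list of open ones."""
    ports = list(ports)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: is_open(host, p), ports)
        return sorted(p for p, ok in zip(ports, results) if ok)
```

    A call such as scan("127.0.0.1", range(1, 1025)) would check the first 1024 ports. Here the threads spend most of their time waiting for replies or timeouts, which is a different situation from the send-only workload discussed later in this post.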

    A port scanning request is, for the most part, an IP packet with a TCP or UDP payload. In the TCP case we don't want to create a full TCP connection in order to determine whether the port is open. We can simply send a TCP SYN packet, and if we get a response with SYN + ACK we can conclude that the port is open. There are several other techniques for determining the status of a port that likewise do not require a full connection. To send these TCP SYN packets we need to use raw sockets; we cannot use regular stream sockets.

    The important summary of the above is that we create a TCP SYN packet for each port that we are going to scan and send them in parallel to the destination IP using a raw socket.
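
    To make the packet itself concrete, here is a minimal Python sketch that builds the 20-byte TCP header of a SYN segment, including the checksum over the IPv4 pseudo-header. It only constructs the bytes; actually sending them requires a raw socket and root privileges, which is left out. The function names, the default sequence number and the window size are assumptions for illustration, not the assignment's code.

```python
import socket
import struct

def checksum(data):
    """16-bit one's-complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_syn(src_ip, dst_ip, src_port, dst_port, seq=0):
    """Return a 20-byte TCP header with only the SYN flag set and no payload."""
    offset_flags = (5 << 12) | 0x002  # data offset = 5 words, SYN bit
    window = 65535                    # arbitrary advertised window
    header = struct.pack("!HHLLHHHH", src_port, dst_port, seq, 0,
                         offset_flags, window, 0, 0)
    # The TCP checksum also covers a pseudo-header taken from the IP layer.
    pseudo = struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                         socket.inet_aton(dst_ip), 0, socket.IPPROTO_TCP,
                         len(header))
    csum = checksum(pseudo + header)
    # The checksum field occupies bytes 16-17 of the header.
    return header[:16] + struct.pack("!H", csum) + header[18:]
```

    One such segment would be built per target port, changing only dst_port (and ideally seq) each time.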

    Since these are raw sockets, there is no TCP flow control or TCP congestion control. Once we call send with the IP packet, the kernel will send the packet as fast as it can. First it copies the buffer into kernel space and hands it over to the I/O subsystem to send. The important thing is that the call to send the packet does not block on I/O, as we are not waiting for the packet to be delivered to the destination, or even to leave the host machine. But since send involves a system call, it can be a bit slow compared to ordinary function calls. Also note that these are very small packets: the IP packet is no more than 50 bytes, and there is no data in the TCP segment.

    The assignment asked us to create threads to make the sending process parallel. The expectation is that as the number of threads increases, the sending speed will increase. But is this really true? Will threads increase the sending performance linearly?

    If we run the program on a multi-core machine, increasing the number of threads should definitely increase the sending rate until the number of threads is somewhat greater than the number of available processors. But will performance still improve if we increase the number of threads to 100 when only 8 processors are available? My gut feeling is NO. People tend to think that increasing the number of threads will always increase the sending rate. But since the operations are CPU bound, increasing the number of threads beyond the number of available cores is not going to give any performance gain. On the contrary, the extra threads should decrease the performance of the system, since they only add scheduling and context-switching overhead.

    This is a conclusion I reached from theoretical knowledge and past experience. No performance test was done. Maybe I should do a performance test, or maybe I should not :) Let's see what others have to say.









  7. I worked for three years at a company called WSO2, building Web Services-based middleware systems. Five years ago we believed Web Services were going to take over the world by becoming the de facto standard in communication. Web Services are a nice, well-thought-out technology. They have a metadata model, a rich set of tooling, and a built-in security model, all of which are essential parts of enterprise-grade communication. WS is also backed by industry giants like Microsoft, IBM, and Oracle. Despite this background, WS lacks mass adoption. It seems only big companies with a lot of money and resources invest in Web Services adoption. For some time I was bothered by this strange situation. Why does such a robust and well-thought-out messaging paradigm, backed by industry giants, lack adoption?

    On the other hand, REST doesn't have any of the nice things about Web Services. It doesn't have a well-thought-out metadata model, comprehensive tooling support, or a standard security model. But against all odds REST seems to be gaining traction and becoming the cool thing in communication. Everyone wants to build a cool REST API.

    So what is wrong with SOAP and what is so appealing about REST?

    From my perspective, this all comes down to the developer mindset. Developers love it when the sky is the limit for their programming. They want to create unique designs that no one else has done. They want to explore the unknown.

    REST as a framework gives developers exactly that: the freedom to innovate. Developers love that. As a framework REST introduces very few rules. Most of them are not even rules; rather, they are soft guidelines. So building a proper, powerful REST API is an art and a challenging task. In my opinion this makes it cool.

    This is the complete opposite of SOAP. SOAP gives the developer all the rules. The first thing about SOAP is that you get a SOAP envelope in which you have to put your data according to certain rules. So from start to end the system is pretty well defined. There are standards for doing even the simplest things. On top of all this, SOAP is very complex to understand. So in my opinion SOAP doesn't give the user the nuts and bolts to innovate. Instead it gives a half-baked solution that the user has to obey. I guess not all developers like to be bogged down by rules, and they don't like to learn those kinds of rules either.

    There are many more aspects to REST vs SOAP than what I have mentioned above. Someone can argue that SOAP is better than REST. But I just wanted to express my perspective on what is happening in the world.




  8. Download the Google Chrome .deb package from the internet.

    Go to the download folder and execute the command

    sudo dpkg -i google-chrome.deb

    For example:
    sudo dpkg -i google-chrome-stable_current_i386.deb

    If it fails due to unsatisfied dependencies, use the following command:

    sudo apt-get -f install

    This will download the missing dependencies and finish installing Google Chrome.

    Enjoy!



  9. To count the lines of code in a Java project, use the command

    find . -name "*.java" -not -iwholename '*target*' -not -iwholename "*.svn*" | xargs wc -l

    This will search for .java files and ignore any file whose path contains target or .svn (matched case-insensitively), which covers the target and .svn directories.
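
    For systems without GNU find, a rough Python equivalent is sketched below. It is not identical to the command above: it skips only directories named exactly target or .svn, and it counts a final line even when it lacks a trailing newline. The function name count_java_lines is hypothetical.

```python
import os

def count_java_lines(root="."):
    """Count lines in all .java files under `root`, skipping target and .svn."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not enter them.
        dirnames[:] = [d for d in dirnames if d not in ("target", ".svn")]
        for name in filenames:
            if name.endswith(".java"):
                with open(os.path.join(dirpath, name), "rb") as f:
                    total += sum(1 for _ in f)
    return total
```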


  10. We can use the gpg utility to password-protect a file on Unix-based systems. The command

    gpg -c filename

    will ask for a password and create a new file with the extension .gpg. The new file is password protected.

    To decrypt the file, just use

    gpg "protectedfilename"

    It will ask for the password and decrypt the file.
