What is Stream Processing?

Stream processing is the processing of data on the fly, as it arrives, without keeping much intermediate state. Anyone who has used the | pipe in Unix has done stream processing: data comes in from a file or some other source, and a shell command on the other end of the pipe processes it.
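
As a rough illustration of the pipe analogy, here is a minimal Python sketch that chains generators the way a shell chains commands. The function names (read_lines, grep, count) and the sample log lines are made up for this example. Each stage pulls one record at a time from the stage before it, so only a running counter is kept as state.

```python
from typing import Iterable, Iterator


def read_lines(source: Iterable[str]) -> Iterator[str]:
    """Yield records one at a time, without the trailing newline."""
    for line in source:
        yield line.rstrip("\n")


def grep(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Pass through only the lines that contain `needle`."""
    for line in lines:
        if needle in line:
            yield line


def count(lines: Iterable[str]) -> int:
    """Consume the stream and return how many lines it held."""
    total = 0
    for _ in lines:
        total += 1
    return total


# Hypothetical sample input; a file object opened with open() would work too,
# since iterating over a file also yields one line at a time.
sample = [
    "GET /index.html 200\n",
    "GET /missing 404\n",
    "POST /login 200\n",
]

# Roughly the same shape as: cat access.log | grep ' 200' | wc -l
matches = count(grep(read_lines(sample), " 200"))
print(matches)
```

Because every stage is a generator, no stage builds a full list of its input, which is the property that lets the same code run over inputs larger than memory.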

People often underestimate how fast modern machines are before they deploy complex distributed stream processing applications. You can often take a small dataset and try Unix tools like awk to get a feel for the data, or solve the data partitioning problem with a small program, before putting the effort into deploying a distributed system like Spark. A good intuition about the structure of the underlying data can save you a lot of headaches later.
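
One way to build that intuition on a single machine is a small script that streams over a sample and reports how records would spread across partitions. The sketch below is only an assumption about what such a script might look like: it treats the first comma-separated field as the key, uses a CRC32 hash modulo a hypothetical partition count, and runs on made-up sample records. It is not how Spark or any other system assigns partitions.

```python
import zlib
from collections import Counter
from typing import Iterable


def partition_sizes(lines: Iterable[str], num_partitions: int) -> Counter:
    """Count how many records would land in each hash partition.

    Assumes the partition key is the first comma-separated field.
    """
    sizes: Counter = Counter()
    for line in lines:
        key = line.split(",", 1)[0]
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        partition = zlib.crc32(key.encode("utf-8")) % num_partitions
        sizes[partition] += 1
    return sizes


# Hypothetical sample records in "key,value" form.
sample = ["alice,3", "bob,5", "alice,7", "alice,1", "carol,2"]

sizes = partition_sizes(sample, num_partitions=4)
for partition, size in sorted(sizes.items()):
    print(f"partition {partition}: {size} records")
```

A heavily skewed result, where one partition holds most of the records, is the kind of structural fact that is cheap to discover here and expensive to discover after a cluster is already running.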
