Hadoop on Twitter 2015-02
2015-03-02 | #Me
Below are the top media (and other statistics) for the “Hadoop” search on Twitter in February 2015.
2015-02-05 | #twitter
How easy it is to analyze JSON Twitter data using Apache Spark and Apache Hadoop! Below are some examples: tweets about Expo2015, tweets about Hadoop, tweets about opensource.
2015-01-15 | #Me
Instead of using the old Hadoop way (map/reduce), I suggest using the newer and faster way (Apache Spark on top of Hadoop YARN): in a few lines you can open all tweets (gzipped JSON files saved in several subdirectories hdfs://path/to/YEAR/MONTH/DAY/*gz) and query them in a SQL-like language: `sc = SparkContext(appName="extraxtStatsFromTweets.`
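The idea of opening all the gzipped JSON files and querying them with SQL can be sketched roughly as follows. This is not the original script (which is cut off above) but a minimal illustration with some assumptions: it uses the newer `SparkSession` entry point (Spark 2.0+) rather than `SparkContext`, it assumes one JSON tweet per line, and it assumes each tweet carries a `lang` field. The names `tweets_glob`, `TOP_LANGS_SQL`, `run_on_cluster` and the app name `tweetStatsSketch` are hypothetical.

```python
def tweets_glob(base="hdfs://path/to"):
    """Build a glob matching every gzipped file under YEAR/MONTH/DAY."""
    return base.rstrip("/") + "/*/*/*/*gz"


# Example query (assumes a `lang` field in each tweet): tweets per language.
TOP_LANGS_SQL = """
SELECT lang, COUNT(*) AS n
FROM tweets
GROUP BY lang
ORDER BY n DESC
LIMIT 10
"""


def run_on_cluster(base="hdfs://path/to"):
    # Imported here so the helpers above work even without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("tweetStatsSketch").getOrCreate()

    # Spark decompresses .gz files transparently and infers the JSON schema.
    tweets = spark.read.json(tweets_glob(base))

    # Register the DataFrame under a name so it can be queried with SQL.
    tweets.createOrReplaceTempView("tweets")
    spark.sql(TOP_LANGS_SQL).show()

    spark.stop()
```

Calling `run_on_cluster()` from a script submitted with `spark-submit --master yarn` would run the query on the cluster; the base path is the placeholder from the text and has to be replaced with a real one.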
2014-11-25 | #Me
Apache Spark has just passed Hadoop in popularity on the web (Google Trends). My first Apache Spark usage was extracting texts from tweets I’ve been collecting in Hadoop HDFS. My python script tweet-texts.
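Extracting the texts from collected tweets could look something like the sketch below. It is not the script mentioned above (its name is cut off in this excerpt), only an illustration under assumptions: one JSON object per line, the tweet body stored in the `text` field, and non-tweet records (such as delete notices from the streaming API) simply skipped. The names `extract_text`, `run` and `tweetTextsSketch` are hypothetical.

```python
import json


def extract_text(line):
    """Return the tweet text from one JSON line, or None if unusable."""
    try:
        tweet = json.loads(line)
    except ValueError:
        # Malformed or truncated line: skip it.
        return None
    if not isinstance(tweet, dict):
        return None
    # Records without a "text" field (e.g. delete notices) yield None.
    return tweet.get("text")


def run(input_glob, output_dir):
    # Imported here so extract_text can be tried without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="tweetTextsSketch")
    texts = (
        sc.textFile(input_glob)  # .gz files are decompressed transparently
        .map(extract_text)
        .filter(lambda t: t is not None)
    )
    texts.saveAsTextFile(output_dir)
    sc.stop()
```

Keeping the parsing in a plain function makes it easy to try on a few sample lines locally before sending the job to the cluster.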
2014-10-03 | #airflow #apache #bigdata #hadoop #luigi #oziee #workflow
How to orchestrate your Hadoop jobs? Possible solutions are: Apache Oozie (included in the top Hadoop distributions), Azkaban from LinkedIn, Luigi from Spotify, and Apache Airflow from Airbnb. See for instance a comparison among Luigi, Airflow and Pinball at http://bytepawn.