Pinned
Guide for Apache Spark Setup, Job Optimisation, AWS EMR Cluster Configuration, S3, YARN and HDFS Optimisation
How to optimize an Apache Spark job? How to perform joins efficiently? How to optimize an AWS EMR cluster? How to optimize S3? How to optimize YARN? How to optimize HDFS? How to fix Apache Spark job errors? How to fix AWS…
Spark
10 min read
Sep 11, 2020
Docker Keynotes
Deployment Architecture: A Docker container can be deployed on:
1. A container orchestration tool such as Kubernetes, Docker Swarm, or OpenShift, for production use
2. The Docker daemon, for development use
Container Isolation Architecture
2 min read
Dec 11, 2019
Install Maven JAR dependencies in Zeppelin
This is a step-by-step tutorial for installing Maven dependencies in Zeppelin.
Get the dependency declaration from the Maven Repository website.
Create the Zeppelin dependency declaration in the following format: <groupId>:<artifactId>:<version>, for example ml.dmlc:xgboost4j-spark:0.90
Stop/restart the Zeppelin Spark interpreter and run the command %dep z.load("<groupId>:<artifactId>:<version>")
Example: %dep z.load("ml.dmlc:xgboost4j-spark:0.90")
Java
1 min read
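The coordinate format used in the Zeppelin steps can be sketched as a small helper. This is only an illustration: the function names `maven_coordinate` and `dep_command` are hypothetical and not part of Zeppelin or Maven. The helper just assembles the colon-separated string and the `%dep z.load(...)` command text, which you would still paste into a Zeppelin paragraph yourself.

```python
def maven_coordinate(group_id: str, artifact_id: str, version: str) -> str:
    """Join the three Maven fields into the groupId:artifactId:version form.

    Hypothetical helper for illustration; it only builds a string.
    """
    parts = (group_id, artifact_id, version)
    # Reject empty fields and stray colons, which would corrupt the format.
    if any(not part or ":" in part for part in parts):
        raise ValueError("each field must be non-empty and must not contain ':'")
    return ":".join(parts)


def dep_command(group_id: str, artifact_id: str, version: str) -> str:
    """Build the text of the %dep command shown in the tutorial steps."""
    coordinate = maven_coordinate(group_id, artifact_id, version)
    # Straight double quotes are used, since curly quotes are not valid
    # string delimiters in the interpreter.
    return f'%dep z.load("{coordinate}")'


if __name__ == "__main__":
    # Uses the example dependency from the tutorial.
    print(dep_command("ml.dmlc", "xgboost4j-spark", "0.90"))
```

Building the string from separate fields makes it harder to swap the order or leave one out, which is the usual way this declaration goes wrong.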