Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

A flexible two-column Jekyll theme. Perfect for personal sites, blogs, and portfolios hosted on GitHub or your own server. Latest release v4.9.0

Splash Page

Bacon ipsum dolor sit amet salami ham hock ham, hamburger corned beef short ribs kielbasa biltong t-bone drumstick tri-tip tail sirloin pork chop.

Posts

Automating GCE Image Creation with Packer

15 minute read

Ensuring consistency and streamlining infrastructure provisioning is crucial for effective cloud management. Creating custom virtual machine (VM) images on G...

Design IAM on Google Cloud Platform

14 minute read

Security is paramount when setting up resources in the cloud or on-premises. It encompasses various layers of protection, including network security, encrypt...

Exploring Data Extraction from BigQuery

3 minute read

BigQuery, Google’s fully-managed and serverless data warehouse, empowers organizations to analyze massive datasets with remarkable speed and efficiency. But ...

Effective Terraform Validation Techniques

4 minute read

Validation in Terraform is an essential practice to detect and prevent errors early in the infrastructure provisioning process. By incorporating robust valid...

Python - Virtual Environment.

2 minute read

A Python virtual environment creates an isolated workspace for Python work. This helps in creating project-specific virtual environments without worrying about ...

Cloud VPN - GCP Learning Notes.

12 minute read

Cloud VPN securely connects your peer network to your Virtual Private Cloud (VPC) network through an IPsec VPN connection.

Python - List Comprehensions.

9 minute read

List comprehensions provide an easy and functional way to create lists in Python. A single line of code can do what would otherwise take a few lines. l...

Python Getting Started - Learning Notes.

66 minute read

Python is a clear and powerful object-oriented programming language, comparable to Perl, Ruby, Scheme, or Java. This is basic documentation for getting star...

getmerge Operation not Permitted

1 minute read

The getmerge command takes a source directory and a destination file as input and concatenates the files in the source directory into the destination local file.

Cloud IAM - GCP Learning Notes.

8 minute read

Google Cloud offers IAM, which lets you give more granular access to specific Google Cloud resources and prevents unwanted access to other resources.

Long Running Jobs in YARN distcp.

1 minute read

Usually distcp and other batch jobs run for a very long time. This is fine if you are running a Hadoop environment without Kerberos. When we kerberize a clus...

Checking HDFS health using fsck.

3 minute read

When we have large data sets on the cluster, some blocks will become corrupted. This could be due to disk failures or other causes.

MongoDB to Neo4j Using neo4j_doc_manager.

10 minute read

We had a requirement to replicate all the data in MongoDB to Neo4j in order to show a few graphs. Here is a quick way to demonstrate ...

Cropping Bulk Images Using Python.

3 minute read

I was working on getting post headers for my posts on this blog. I had a couple of images from Unsplash, but the header for a post needs to be a little more ho...

Kafka Kerberos Enable and Testing.

19 minute read

Apache Kafka is a distributed streaming platform. Kafka 2.0 supports Kerberos authentication. Enabling Kerberos Authentication Using the Wizard on Cloudera m...

Parcel Not Distributing Cloudera CDH.

1 minute read

We were deploying one of the clusters in our lab environment, which is used by everyone, so the lab has its own share of stale information on it.

Creating /etc/hosts file in Chef.

4 minute read

We had a cluster environment in which we needed to update the /etc/hosts file, which would help the servers communicate over a private network. Our serve...

Enable Kerberos Using Cloudera API.

9 minute read

The Python API for Cloudera is really nice: apart from getting the cluster set up, we can also do configuration and automation. We use a lot of automation using C...

Setting Hue to Listen on 0.0.0.0 [Cloudera]

less than 1 minute read

We were working on setting up a cluster, but the Hue URL was set to a private IP of the server. As we had set up all the nodes to access each other using a pr...

Nagios - Service Group Summary ERROR

22 minute read

We were working on Nagios and found that, after our migration, the service group summary was not working. You might get the error below on the screen and the solution...

Zabbix History Table Clean Up

8 minute read

The Zabbix history table gets really big. If you are in a situation where you want to clean it up, you can do so using the steps below.

Windows Testing Using Kitchen Chef

18 minute read

Kitchen-Vagrant has the capability to spin up a Windows instance for testing. To make it work you will need vagrant-winrm to be installed on the worksta...

Package Installer for Cygwin [apt-cyg].

2 minute read

After a long time I was on my Windows machine and had to make it feel more like my Linux machine. So I installed what everyone else does: Cygwin. Surprise...

Installing CouchDB on Ubuntu 14 LTS.

1 minute read

CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents and query your indexes with your web brows...

Installing MongoDB on Ubuntu 14 LTS.

9 minute read

MongoDB is an open-source document database and a leading NoSQL database. MongoDB is written in C++. Below is a brief document about installing MongoDB on a...

Encrypted Data Bags - Chef

17 minute read

Data bags are a way to store information on the Chef server that all the cookbooks can access. A few additional advantages are that we can encrypt the da...

Remove Old Files using find Command

2 minute read

GNU find searches the directory tree rooted at each given file name by evaluating the given expression from left to right, according to the rules of preceden...

Setting up SSL https On Nagios XI Server

7 minute read

HTTPS is a protocol for secure communication over a computer network which is widely used on the Internet. HTTPS consists of communication over Hypertext Tra...

Update hosts file in Windows 8

3 minute read

The hosts file contains IP addresses, each followed by the FQDN that can be used to reach that address. The hosts file takes precedence over your DNS servers. In Microsoft operati...

Creating Documents Using pandoc

8 minute read

Pandoc is an open-source utility for creating documents from Markdown. We can create PDF, Doc, doc, HTML and other formats, and it can also be used to convert HTML ...

Setup/Configuration Nagios XI on Centos6.6

13 minute read

Nagios monitors your entire IT infrastructure to ensure systems, applications, services, and business processes are functioning properly. In the event of a f...

RPM Command Cheat Sheet

2 minute read

RPM (Red Hat Package Manager) is the most popular package utility and is used mostly on RHEL, CentOS and Fedora. RPM helps users and admins build, install, que...

Chef Workstation Setup on Windows Machine.

15 minute read

The Chef Development Kit (ChefDK) brings the best-of-breed development tools built by the awesome Chef community to your workstation with just a few clicks. ...

Check Port on Remote Server CentOS 6.6/RHEL 6

less than 1 minute read

Check whether a port is available on a remote machine using the nc command instead of telnet. The same command can be used to check on a remote server as well; change the 127....

LUKS Disk encryption for CentOS 6.6/RHEL 6

16 minute read

Linux Unified Key Setup-on-disk-format (or LUKS) allows you to encrypt partitions on your Linux computer. This is particularly important when it comes to mo...

Mysql Database Disk Usage.

6 minute read

We were running out of disk space on one of the database servers and needed to find out what the current table and database usage was. Below are a few comm...

Installing python 2.7.x on Centos 6.5/6.6

1 minute read

By default, CentOS comes with Python 2.6. In most cases we might need Python 2.7 or later to be installed. Below are a few ways to install Python 2.7 on ...

Zabbix Template Creation using CSV file.

8 minute read

In Zabbix we don't have a better way to capture SNMP traps. We have to manually create an item and a corresponding trigger to handle a trap arriving from the de...

Setting up SNMP Trapper for Zabbix.

11 minute read

Receiving SNMP traps is the opposite of querying SNMP-enabled devices. In this case the information is sent from an SNMP-enabled device and is collected or “t...

Access Filter Setup with SSSD

3 minute read

If using access_provider = ldap, this option is mandatory. It specifies an LDAP search filter criteria that must be met for the user to be granted access on ...

Getting started with Hive with Kerberos.

9 minute read

Apache Hive is a powerful data warehousing application built on top of Hadoop; it enables you to access your data using Hive QL, a language that is similar t...

Installing ansible on RHEL 6.6.

4 minute read

Ansible is a radically simple IT automation engine that automates cloud provisioning, configuration management, application deployment, intra-service orchest...

Mounting RAID10 using parted.

11 minute read

GUID Partition Table (GPT) is a standard for the layout of the partition table on a physical hard disk, using globally unique identifiers (GUID). Although it...

Ansible Playbook - Setup Storm Cluster.

2 minute read

This is a simple Storm Cluster Setup. We are using a dedicated Zookeeper Cluster/Node, instead of the standalone zkserver. Below is how we will deploy our c...

Ansible Playbook - Setup Kafka Cluster.

2 minute read

This is a simple Kafka setup. In this setup we are running Kafka over a dedicated ZooKeeper service (NOT the standalone ZooKeeper that comes with Kafka).

Streaming Data Processing - Storm Vs Spark.

6 minute read

Apache Spark is an in-memory distributed data analysis platform, primarily targeted at speeding up batch analysis jobs, iterative machine learning jobs, inte...

Tuning Hadoop Performance with sysctl.conf

13 minute read

This post delves into optimizing Hadoop performance at the kernel level using sysctl. The sysctl interface provides a way to dynamically modify a running Lin...

Performance Tuning for nginx

6 minute read

Nginx (pronounced “engine x”) is a powerful and versatile web server renowned for its high concurrency, exceptional performance, and efficient memory utiliza...

How To Configure Swappiness

7 minute read

Swappiness is a crucial Linux kernel parameter that dictates how aggressively the system uses swap space. Understanding and configuring swappiness can signif...

Setting up Tomcat Cluster for SpagoBI 5.1.

14 minute read

This post details how to set up a Tomcat cluster for SpagoBI 5.1, building upon a previous guide for installing SpagoBI with MySQL (post). This configuration...

Zabbix Installation 2.4 - CentOS 6.5

1 minute read

Zabbix is the ultimate enterprise-level software designed for monitoring availability and performance of IT infrastructure components. Zabbix is open source ...

Zabbix Hadoop Monitoring.

6 minute read

This script can be used to monitor NameNode parameters. It can generate Zabbix import XML or send monitoring data to the Zabbix server.

SFTP Data Collector

3 minute read

An easy way to collect files recursively from an SFTP server is to connect to the server over scp and run scp -r. The problem was that the device we were connecting ...