
Vagrant: Building and maintaining portable virtual software development environments

I had a new developer joining my team, and onboarding required him to install all the necessary software. The project was complex, with a disparate set of software and modules that all had to work together seamlessly. Despite best efforts, it took the developer a couple of hours to completely set up his machine.


It got me thinking about whether something could be done to improve and expedite this onboarding. Why should it take a new developer so much time to set up his system when the very same activity has already been done a couple of times by earlier developers?

A little bit of ‘googling’ made me stumble upon something called Vagrant. Perhaps I was too ignorant before, but now I realize there are better ways to handle this problem. The activity that took our developer hours can be finished in a few minutes.

Here is how Vagrant can help you set up your development environment in minutes.

  1. Install the latest version of Vagrant from https://www.vagrantup.com/downloads.html. You can download the version for your OS. You can also read more about Vagrant at https://www.vagrantup.com/docs/getting-started/
  2. After installing Vagrant, you will need to install VirtualBox from https://www.virtualbox.org

Now that you have installed Vagrant and VirtualBox, let's play around with them a bit.

From your bash shell, run the following commands:

$ vagrant init hashicorp/precise64

$ vagrant up

After running these commands, you will have a virtual machine running Ubuntu 12.04 LTS 64-bit. You can SSH into the machine with

$ vagrant ssh

and when you are done playing around with your newly created virtual machine, you can destroy it by running

$ vagrant destroy

Next Steps

Now that you have created a virtual environment, let's see how to get started with creating a new Vagrant-aware project.

New Project

Setting up a new project requires creating a new directory and then running the init command inside that directory.

$ mkdir new_vagrant_project

$ cd new_vagrant_project

$ vagrant init

The init command above places a new file named Vagrantfile inside the current directory. You can also make an existing project Vagrant-aware by running the same vagrant init command from its directory.

So far all you have in your directory is one single file called Vagrantfile. But where is the OS? We have not yet installed it. How will my project run in my favorite OS?

The answers to these questions lie in VirtualBox. VirtualBox is the virtualization software that hosts your guest OS. Building the virtual machine from scratch would be a slow and tedious process, as all the OS files would need to be downloaded every time. Instead, Vagrant uses a base image to quickly clone the virtual machine. These base images are called boxes in Vagrant, and as the Vagrant website says, “specifying the box to use for your vagrant environment is the first step after creating a new Vagrantfile”.

The box, and therefore the OS, needs to be specified in the Vagrantfile. Below is how you can tell Vagrant that you would like to use Ubuntu Precise 64 to run your application.

Vagrant.configure("2") do |config|
  config.vm.box = "hashicorp/precise64"
end

Vagrant gives you a virtual environment of a server with any OS of your liking. In this example, we used the Precise 64 version of Ubuntu. If you would like to use a different box, you can search for options here:

https://app.terraform.io/session

It's time to boot up the virtual machine. This can be done using

$ vagrant up

Next, we can log in to the machine by running

$ vagrant ssh

When you are done fiddling around with the machine, you can destroy it by running vagrant destroy.

Now that the OS is ready, it's time to install the necessary software and other dependencies. How do we do that?
Enter Ansible!!

Ansible helps us provision the virtual machine booted up in the steps above. Provisioning is nothing but installing and configuring the dependencies required to run your application.

Ansible (http://docs.ansible.com/ansible/index.html) can be downloaded and installed on your machine by following http://docs.ansible.com/ansible/intro_installation.html#installing-the-control-machine

Please note that Ansible is not the only provisioning tool that works with Vagrant. Vagrant works equally well with other provisioners such as Puppet and Chef.

The provisioner, Ansible in this case, needs to be configured in the Vagrantfile so that Vagrant knows how to provision the virtual machine after it boots.

The basic Vagrantfile configuration for Ansible looks like this:

Vagrant.configure("2") do |config|
  # Base box (Ubuntu 12.04 LTS 64-bit)
  config.vm.box = "hashicorp/precise64"

  # Replace x with the last octet of the address the VM should use
  config.vm.network "private_network", ip: "192.168.1.x"

  # Replace xxxx and yyyy with real port numbers
  config.vm.network "forwarded_port", guest: xxxx, host: yyyy

  # Provision the VM with Ansible after it boots
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end
end

The ‘private_network’ setting gives your virtual machine an IP address on a private network, so that traffic can flow between the host and the virtual machine.

The ‘forwarded_port’ setting forwards a port on the host to a port inside the VM: requests arriving at port yyyy on the host machine are routed to the application listening on port xxxx inside the VM.

Playbooks are an integral component of Ansible. A playbook contains the instructions that Ansible executes to get your machine ready. These instructions can be a list of software packages to download and install, or any other configuration that your application requires to function properly. Playbooks are expressed in YAML format. Each playbook is composed of one or more ‘plays’ in a list.

The goal of a play is to map a group of hosts to some well-defined roles, represented by ‘tasks’.

Here is a playbook example with just one play.

- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root
  tasks:
  - name: ensure apache is at the latest version
    yum: name=httpd state=latest
  - name: write the apache config file
    template: src=/srv/httpd.j2 dest=/etc/httpd.conf
    notify:
    - restart apache
  - name: ensure apache is running (and enable it at boot)
    service: name=httpd state=started enabled=yes
  handlers:
    - name: restart apache
      service: name=httpd state=restarted

A playbook can also have multiple plays, with each play running on a different group of servers, as described in http://docs.ansible.com/ansible/playbooks_intro.html
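To make the "list of plays" structure concrete, here is a minimal Ruby sketch that parses a small two-play playbook with Ruby's standard YAML library and prints which tasks each host group would receive. The `databases` group, the PostgreSQL task, and the `summarize` helper are hypothetical, invented purely for illustration. Ansible itself does not need this script; it only shows how the YAML is shaped.

```ruby
require "yaml"

# Hypothetical two-play playbook, embedded as a string for illustration.
PLAYBOOK = <<~YAML
  - hosts: webservers
    remote_user: root
    tasks:
      - name: ensure apache is at the latest version
        yum: name=httpd state=latest
      - name: ensure apache is running
        service: name=httpd state=started

  - hosts: databases
    remote_user: root
    tasks:
      - name: ensure postgresql is at the latest version
        yum: name=postgresql state=latest
YAML

# A playbook parses to an Array of plays; each play is a Hash.
# Returns { host group => [task names] }.
def summarize(plays)
  plays.to_h do |play|
    [play["hosts"], play.fetch("tasks", []).map { |task| task["name"] }]
  end
end

summarize(YAML.safe_load(PLAYBOOK)).each do |hosts, task_names|
  puts "#{hosts}:"
  task_names.each { |name| puts "  - #{name}" }
end
```

Each top-level list item is one play, and the `hosts` key decides which group of servers that play's tasks run on.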

In the next part of this series, I will take a real example of an application that requires multiple software packages and configurations, and show how we use Vagrant and Ansible to run it on the developer's machine and then automate deployment to cloud servers.

If you have any queries about virtualizing your development environment to make it a replica of production, feel free to reach us at hello@mantralabsglobal.com. Our developers are here to clear up any confusion and help you decide whether this approach is a good fit for your business and technical needs.

This guest post has been written by Parag Sharma, CEO of Mantra Labs.

He is a 14-year IT industry veteran with stints in companies like Zapak and RedBus before founding Mantra Labs back in 2009. Since then, Mantra has dabbled in various products and is now a niche technology solutions house for enterprises and startups.

Mantra Labs is an IT service company whose core services are Web Development, Mobile Development, Enterprise on the Cloud, and Internet of Things. The company also incubates start-ups, provides proactive solutions, and acts as a technical partner to funds and entrepreneurs.


Lake, Lakehouse, or Warehouse? Picking the Perfect Data Playground


In 1997, the world watched in awe as IBM’s Deep Blue, a machine designed to play chess, defeated world champion Garry Kasparov. This moment wasn’t just a milestone for technology; it was a profound demonstration of data’s potential. Deep Blue analyzed millions of structured moves to anticipate outcomes. But imagine if it had access to unstructured data—Kasparov’s interviews, emotions, and instinctive reactions. Would the game have unfolded differently?

This historic clash mirrors today’s challenge in data architectures: leveraging structured, unstructured, and hybrid data systems to stay ahead. Let’s explore the nuances between Data Warehouses, Data Lakes, and Data Lakehouses—and uncover how they empower organizations to make game-changing decisions.

Deep Blue’s triumph was rooted in its ability to process structured data—moves on the chessboard, sequences of play, and pre-defined rules. Similarly, in the business world, structured data forms the backbone of decision-making. Customer transaction histories, financial ledgers, and inventory records are the “chess moves” of enterprises, neatly organized into rows and columns, ready for analysis. But as businesses grew, so did their need for a system that could not only store this structured data but also transform it into actionable insights efficiently. This need birthed the data warehouse.

Why was Data Warehouse the Best Move on the Board?

Data warehouses act as the strategic command centers for enterprises. By employing a schema-on-write approach, they ensure data is cleaned, validated, and formatted before storage. This guarantees high accuracy and consistency, making them indispensable for industries like finance and healthcare. For instance, global banks rely on data warehouses to run real-time risk assessments and detect fraud, a necessity when billions of transactions are processed daily. Tools like Amazon Redshift, Snowflake Data Warehouse, and Azure Data Warehouse are vital for this kind of workload. Similarly, hospitals use them to streamline patient care by integrating records, billing, and treatment plans into unified dashboards.
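As a rough illustration of schema-on-write, the Ruby sketch below validates every row against a fixed schema before it is stored, and rejects anything that does not fit. The `TRANSACTION_SCHEMA` columns, the `WarehouseTable` class, and the sample rows are all hypothetical. Real warehouses enforce schemas in the database engine rather than in application code, so treat this only as a toy model of the idea.

```ruby
# Hypothetical schema for a transactions table: column name => expected Ruby class.
TRANSACTION_SCHEMA = {
  "id" => Integer,
  "account" => String,
  "amount" => Float
}.freeze

class SchemaError < StandardError; end

# A toy in-memory "warehouse table" that validates each row before storing it.
class WarehouseTable
  attr_reader :rows

  def initialize(schema)
    @schema = schema
    @rows = []
  end

  # Schema-on-write: a row that does not match the schema is never stored.
  def insert(row)
    unknown = row.keys - @schema.keys
    raise SchemaError, "unknown columns: #{unknown.join(', ')}" unless unknown.empty?

    @schema.each do |column, type|
      value = row[column]
      unless value.is_a?(type)
        raise SchemaError, "#{column} must be #{type}, got #{value.inspect}"
      end
    end
    @rows << row
  end
end

table = WarehouseTable.new(TRANSACTION_SCHEMA)
table.insert({ "id" => 1, "account" => "ACC-1", "amount" => 250.0 })

begin
  # "amount" is a String here, so the row is rejected at write time.
  table.insert({ "id" => 2, "account" => "ACC-2", "amount" => "lots" })
rescue SchemaError => e
  puts "rejected: #{e.message}"
end

puts "stored rows: #{table.rows.length}"
```

The cost of this approach is paid up front, at ingestion. The benefit is that every query can trust the shape of what it reads.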

The impact is evident: according to a report by Global Market Insights, the global data warehouse market is projected to reach $30.4 billion by 2025, driven by the growing demand for business intelligence and real-time analytics. Yet, much like Deep Blue’s limitations in analyzing Kasparov’s emotional state, data warehouses face challenges when encountering data that doesn’t fit neatly into predefined schemas.

The question remains—what happens when businesses need to explore data outside these structured confines? The next evolution takes us to the flexible and expansive realm of data lakes, designed to embrace unstructured chaos.

The True Depth of Data Lakes 

While structured data lays the foundation for traditional analytics, the modern business environment is far more complex. Organizations today recognize the untapped potential in unstructured and semi-structured data. Social media conversations, customer reviews, IoT sensor feeds, audio recordings, and video content are the modern equivalents of Kasparov’s instinctive reactions and emotional expressions. They hold valuable insights but exist in forms that defy the rigid schemas of data warehouses.

The data lake is the system designed to embrace this chaos. Unlike warehouses, which demand structure upfront, data lakes operate on a schema-on-read approach, storing raw data in its native format until it’s needed for analysis. This flexibility makes data lakes ideal for capturing unstructured and semi-structured information. For example, Netflix uses data lakes to ingest billions of daily streaming logs, combining semi-structured metadata with unstructured viewing behaviors to deliver hyper-personalized recommendations. Similarly, Tesla stores vast amounts of raw sensor data from its autonomous vehicles in data lakes to train machine learning models.
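Schema-on-read can be sketched in a few lines of Ruby. In the example below, raw events are kept exactly as they arrived, and a shape is imposed only when a particular question is asked. The event formats, the `RAW_EVENTS` sample, and the `seconds_watched_by_title` helper are hypothetical, made up for illustration. They do not describe how any real streaming service stores its logs.

```ruby
require "json"

# Hypothetical raw events, stored untouched: mixed shapes plus one malformed line.
RAW_EVENTS = [
  '{"type":"play","user":"u1","title":"Chess Documentary","seconds":1200}',
  '{"type":"review","user":"u2","text":"Loved it","stars":5}',
  '{"type":"play","user":"u2","title":"Chess Documentary","seconds":"300"}',
  'not json at all'
].freeze

# Schema-on-read: the structure is applied here, at query time, not at ingestion.
# Returns total seconds watched per title, skipping records that do not fit.
def seconds_watched_by_title(raw_lines)
  totals = Hash.new(0)
  raw_lines.each do |line|
    event = begin
      JSON.parse(line)
    rescue JSON::ParserError
      next # unreadable for this query, but the raw line stays in the lake
    end
    next unless event.is_a?(Hash) && event["type"] == "play"

    # Accept both numeric and string durations; skip anything else.
    seconds = Integer(event["seconds"], exception: false)
    next if seconds.nil?

    totals[event["title"]] += seconds
  end
  totals
end

seconds_watched_by_title(RAW_EVENTS).each do |title, seconds|
  puts "#{title}: #{seconds}s"
end
```

Nothing is rejected at write time, which is what makes ingestion cheap and flexible. It is also why ungoverned lakes accumulate records that no query can use.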

However, this openness comes with challenges. Without proper governance, data lakes risk devolving into “data swamps,” where valuable insights are buried under poorly cataloged, duplicated, or irrelevant information. Forrester analysts estimate that 60%-73% of enterprise data goes unused for analytics, highlighting the governance gap in traditional lake implementations.

Is the Data Lakehouse the Best of Both Worlds?

This gap gave rise to the data lakehouse, a hybrid approach that marries the flexibility of data lakes with the structure and governance of warehouses. The lakehouse supports both structured and unstructured data, enabling real-time querying for business intelligence (BI) while also accommodating AI/ML workloads. Tools like Databricks Lakehouse and Snowflake Lakehouse integrate features like ACID transactions and unified metadata layers, ensuring data remains clean, compliant, and accessible.

Retailers, for instance, use lakehouses to analyze customer behavior in real time while simultaneously training AI models for predictive recommendations. Streaming services like Disney+ integrate structured subscriber data with unstructured viewing habits, enhancing personalization and engagement. In manufacturing, lakehouses process vast IoT sensor data alongside operational records, predicting maintenance needs and reducing downtime. According to a report by Databricks, organizations implementing lakehouse architectures have achieved up to 40% cost reductions and accelerated insights, proving their value as a future-ready data solution.

As businesses navigate this evolving data ecosystem, the choice between these architectures depends on their unique needs. Below is a comparison table highlighting the key attributes of data warehouses, data lakes, and data lakehouses:

| Feature | Data Warehouse | Data Lake | Data Lakehouse |
|---|---|---|---|
| Data Type | Structured | Structured, Semi-Structured, Unstructured | Structured and Unstructured |
| Schema Approach | Schema-on-Write | Schema-on-Read | Both |
| Query Performance | Optimized for BI | Slower; requires specialized tools | High performance for both BI and AI |
| Accessibility | Easy for analysts with SQL tools | Requires technical expertise | Accessible to both analysts and data scientists |
| Cost | High | Low | Moderate |
| Scalability | Limited | High | High |
| Governance | Strong | Weak | Strong |
| Use Cases | BI, Compliance | AI/ML, Data Exploration | Real-Time Analytics, Unified Workloads |
| Best Fit For | Finance, Healthcare | Media, IoT, Research | Retail, E-commerce, Multi-Industry |

Conclusion

The interplay between data warehouses, data lakes, and data lakehouses is a tale of adaptation and convergence. Just as IBM’s Deep Blue showcased the power of structured data but left questions about unstructured insights, businesses today must decide how to harness the vast potential of their data. From tools like Azure Data Lake, Amazon Redshift, and Snowflake Data Warehouse to advanced platforms like Databricks Lakehouse, the possibilities are limitless.

Ultimately, the path forward depends on an organization’s specific goals—whether optimizing BI, exploring AI/ML, or achieving unified analytics. The synergy of data engineering, data analytics, and database activity monitoring ensures that insights are not just generated but are actionable. To accelerate AI transformation journeys for evolving organizations, leveraging cutting-edge platforms like Snowflake combined with deep expertise is crucial.

At Mantra Labs, we specialize in crafting tailored data science and engineering solutions that empower businesses to achieve their analytics goals. Our experience with platforms like Snowflake and our deep domain expertise make us the ideal partner for driving data-driven innovation and unlocking the next wave of growth for your enterprise.
