Model selection with cross-validation: A quest for an elite model

What do you call a prediction model that performs tremendously well on the same data it was trained on? Technically, tosh! Such a model tends to perform feebly on unseen data, a condition called overfitting.

To combat such a scenario, the dataset is split into a train set and a test set. The model is trained on the train set and never sees the test set, which is then used to estimate how well the model performs. Choosing the split means balancing two competing concerns. Firstly, less training data gives rise to greater variance in the parameter estimates; secondly, less testing data leads to greater variance in the performance statistic. Conventionally, an 80/20 split is considered a suitable starting point, such that neither variance is too high.
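As a minimal sketch of the 80/20 split described above, the snippet below shuffles a list and cuts it in two using only the Python standard library. The helper name `train_test_split`, the seed, and the stand-in dataset of 100 integers are assumptions made for illustration; libraries such as scikit-learn ship a ready-made function for the same job.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: 100 integers standing in for real records.
samples = list(range(100))
train, test = train_test_split(samples, test_fraction=0.2, seed=42)
print(len(train), len(test))  # prints: 80 20
```

The model would be fitted on `train` only, and `test` would be touched once, at the very end.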

Yet another problem arises when we try to fine-tune the hyperparameters. If they are tuned against the test set, information about that set leaks into the model, which can then overfit the test data as well. To prevent this, a dataset is typically divided into train, validation, and test sets, with the validation set acting as an intermediary between training and final evaluation. However, this reduces the number of training examples, making it harder for the model to generalize, and the measured performance ends up depending on one particular random split.
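A three-way split can be sketched in the same spirit. The 60/20/20 proportions and the helper name `train_val_test_split` are assumptions for illustration, not figures from the article; the point is only that hyperparameters are tuned against `val` while `test` stays untouched until the end.

```python
import random


def train_val_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, val, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]  # whatever is left is used for training
    return train, val, test


# Hypothetical dataset of 100 records.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # prints: 60 20 20
```

Note how the training portion shrinks from 80 to 60 samples, which is exactly the cost the paragraph above points out.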

Here’s where cross-validation comes to our rescue!

Cross-validation (CV) eliminates the explicit requirement of a validation set. It facilitates model selection and helps gauge how well a model generalizes. The basic procedure is k-fold CV: the dataset is split into k groups (folds), k-1 folds are used to train the model, and the held-out fold is used to validate it. This is repeated k times so that each fold gets one turn as the validation set. In each iteration, the evaluation score is retained and the model is then discarded. The model’s skill is summarised by the mean of the evaluation scores, and their spread is usually reported as a standard deviation.

Figure: 5-fold cross-validation
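The k-fold procedure can be sketched in a few lines of plain Python. Everything here is illustrative: the synthetic data, the one-parameter "model" (a least-squares slope through the origin), and the helper names are assumptions chosen to keep the example self-contained, not a method taken from the article.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for held_out in range(k):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, folds[held_out]


def fit_slope(xs, ys):
    """Toy model: least-squares slope of a line through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_abs_error(slope, xs, ys):
    return sum(abs(y - slope * x) for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption): y is roughly 3x plus Gaussian noise.
rng = random.Random(1)
X = [rng.uniform(1, 10) for _ in range(100)]
y = [3.0 * x + rng.gauss(0, 1) for x in X]

scores = []
for train_idx, val_idx in k_fold_indices(len(X), k=5):
    slope = fit_slope([X[i] for i in train_idx], [y[i] for i in train_idx])
    score = mean_abs_error(slope, [X[i] for i in val_idx], [y[i] for i in val_idx])
    scores.append(score)  # keep the score; the fitted slope is discarded

print("MAE per fold:", [round(s, 3) for s in scores])
print(f"mean = {statistics.mean(scores):.3f}, std = {statistics.stdev(scores):.3f}")
```

The last line reports the two summary numbers described above: the mean score and its standard deviation across folds.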

But is it feasible when the dataset is imbalanced? 

Probably not! With imbalanced data, plain k-fold can produce folds that contain few or no minority-class samples. An extension called stratified k-fold CV proves to be the magic bullet. It keeps the class proportions in every fold the same as in the original dataset, so the model trains and validates on both the minority and the majority classes.

Figure: Stratified 5-fold cross-validation
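One simple way to get stratified folds, sketched here under the assumption of a 90/10 class imbalance, is to shuffle each class separately and deal its samples round-robin across the folds. The helper name `stratified_k_fold` is hypothetical, and this is only one possible scheme.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Return k folds of indices, each preserving the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)  # deal each class evenly across folds
    return folds


# Hypothetical imbalanced labels: 90 majority (0) and 10 minority (1) samples.
labels = [0] * 90 + [1] * 10
folds = stratified_k_fold(labels, k=5)
for n, fold in enumerate(folds):
    counts = Counter(labels[i] for i in fold)
    print(f"fold {n}: majority={counts[0]}, minority={counts[1]}")
```

Every fold ends up with 18 majority and 2 minority samples, the same 9:1 ratio as the full dataset. Each fold then serves once as the validation set, exactly as in ordinary k-fold.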

Determining the value of k

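This is a baffling concern though! The value of k should be decided carefully, taking the bias-variance trade-off into account: a small k leaves less data for training in each iteration, which biases the performance estimate pessimistically, while a large k increases the computational cost and can increase the variance of the estimate. The value should also be chosen such that each fold is large enough to be representative of the dataset. In practice, k is usually set to 5 or 10, since these values have been observed to work well empirically.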

There are some other variations of cross-validation viz.,

  1. Leave One Out CV (LOOCV): Only one sample is held out for validation in each iteration
  2. Leave P Out CV (LPOCV): Similar to LOOCV, but P samples are held out for validation in each iteration
  3. Nested CV: Each training fold of an outer cross-validation is itself cross-validated, making it a double cross-validation. It is generally used when tuning hyperparameters
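The first two variations can be illustrated with one small generator, since LOOCV is simply leave-P-out with P = 1. The helper name `leave_p_out` and the six-sample dataset are assumptions for illustration; nested CV is not covered by this sketch.

```python
from itertools import combinations
from math import comb


def leave_p_out(n_samples, p):
    """Yield (train_idx, val_idx) for every way of holding out p samples."""
    all_idx = set(range(n_samples))
    for val_idx in combinations(range(n_samples), p):
        yield sorted(all_idx - set(val_idx)), list(val_idx)


n = 6  # hypothetical tiny dataset
loo_splits = list(leave_p_out(n, 1))  # LOOCV: one sample held out per split
lpo_splits = list(leave_p_out(n, 2))  # LPOCV with P = 2
print(len(loo_splits), len(lpo_splits))  # prints: 6 15
print(comb(1000, 2))  # prints: 499500
```

The number of splits grows combinatorially with P: even at P = 2, a dataset of 1,000 samples yields 499,500 train/validate rounds, which is why these variations tend to be reserved for small datasets.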

Finally yet importantly, some tidbits that shouldn’t be ignored:

  • It is important to shuffle the data before moving ahead with cross-validation
  • To avoid data leakage, any data preparation step should be fitted on the training folds only, inside the cross-validation loop, and then applied to the held-out fold
  • It is preferable to repeat the cross-validation procedure, using repeated k-fold or repeated stratified k-fold CV, for more reliable estimates of the performance metrics and, especially, of their variance.
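The three points above can be combined in one sketch: data is shuffled before folding, the scaling statistics are computed from the training folds only, and the whole procedure is repeated with a different shuffle each time. The synthetic data, the choice of standardisation as the preparation step, the three repeats, and the helper names are all assumptions for illustration.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed):
    """Yield (train_idx, val_idx) pairs after shuffling with the given seed."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, folds[held_out]


def standardize(values, mu, sigma):
    return [(v - mu) / sigma for v in values]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic data (an assumption): y is roughly 0.5x plus Gaussian noise.
rng = random.Random(7)
X = [rng.gauss(50, 10) for _ in range(60)]
y = [0.5 * x + rng.gauss(0, 1) for x in X]

scores = []
for repeat in range(3):  # repeated k-fold: a fresh shuffle per repeat
    for train_idx, val_idx in k_fold_indices(len(X), k=5, seed=repeat):
        train_x = [X[i] for i in train_idx]
        # Scaling statistics come from the training folds only (no leakage).
        mu, sigma = statistics.mean(train_x), statistics.stdev(train_x)
        slope, intercept = fit_line(standardize(train_x, mu, sigma),
                                    [y[i] for i in train_idx])
        val_x = standardize([X[i] for i in val_idx], mu, sigma)
        errors = [abs(slope * x + intercept - y[i]) for x, i in zip(val_x, val_idx)]
        scores.append(statistics.mean(errors))

print(f"{len(scores)} scores, mean MAE = {statistics.mean(scores):.3f}, "
      f"std = {statistics.stdev(scores):.3f}")
```

Computing `mu` and `sigma` from the full dataset before the loop would let information from the held-out fold influence training, which is the leakage the second bullet warns about.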

Voila! We finally made it! If the model evaluation scores are acceptably high and have low variance, it’s time to party hard! Our mojo has worked! 

Smart Machines & Smarter Humans: AI in the Manufacturing Industry

We have all witnessed Industrial Revolutions reshape manufacturing, not just once, but multiple times throughout history. Yet perhaps “revolution” isn’t quite the right word. These were transitions, careful orchestrations of human adaptation and technological advancement. From hand production to machine tools, from steam power to assembly lines, each transition proved something remarkable: as machines evolved, human capabilities expanded rather than diminished.

Take the First Industrial Revolution, where the shift from manual production to machinery didn’t replace craftsmen; it transformed them into skilled machine operators. The steam engine didn’t eliminate jobs; it created entirely new categories of work. When chemical manufacturing processes emerged, they didn’t displace workers; they created new manufacturing roles. With each advancement, the workforce didn’t shrink; it evolved, adapted, and ultimately thrived.

Today, we’re witnessing another manufacturing transformation on factory floors worldwide. But unlike the mechanical transformations of the past, this one is digital, driven by artificial intelligence (AI) working alongside human expertise. Just as our predecessors didn’t simply survive the mechanical revolution but mastered it, today’s workforce isn’t being replaced by AI in manufacturing. Workers are becoming AI conductors, orchestrating a symphony of smart machines, industrial IoT (IIoT), and intelligent automation that amplifies human productivity in ways the steam engine’s inventors could never have imagined.

Let’s explore how this new breed of human-AI collaboration is reshaping manufacturing, making work not just smarter, but fundamentally more human. 

Tools and Techniques Enhancing Workforce Productivity

1. Augmented Reality: Bringing Instructions to Life

AI-powered augmented reality (AR) is revolutionizing assembly, equipment operation, and maintenance on factory floors. Imagine a technician troubleshooting complex machinery while wearing AR glasses that overlay real-time instructions. Microsoft HoloLens merges physical environments with AI-driven digital overlays, providing immersive step-by-step guidance. Meanwhile, PTC Vuforia’s AR solutions offer comprehensive real-time guidance and expert support by visualizing machine components and manufacturing processes. Ford’s AI-driven AR applications built on HoloLens have cut design errors and improved assembly efficiency, making smart manufacturing faster and more precise.

2. Vision-Based Quality Control: Flawless Production Lines

Identifying minute defects on fast-moving production lines is nearly impossible for the human eye, but AI-driven computer vision systems are revolutionizing quality control in manufacturing. Landing AI customizes AI defect detection models to identify irregularities unique to a factory’s production environment, while Cognex’s high-speed image recognition solutions achieve up to 99.9% defect detection accuracy. With these AI-powered quality control tools, manufacturers have reduced inspection time by 70%, improving the overall product quality without halting production lines.

3. Digital Twins: Simulating the Factory in Real Time

Digital twins, virtual replicas of physical assets, are transforming real-time monitoring and operational efficiency. Siemens MindSphere provides a cloud-based AI platform that connects factory equipment for real-time data analytics and actionable insights. GE Digital’s Predix enables predictive maintenance by simulating different scenarios to identify potential failures before they happen. By leveraging AI-driven digital twins, industries have reported a 20% reduction in downtime, with the global digital twin market projected to grow at a CAGR of 61.3% through 2028.

4. Human-Machine Interfaces: Intuitive Control Panels

Traditional control panels are being replaced by intuitive AI-powered human-machine interfaces (HMIs), which simplify machine operations and predictive maintenance. Rockwell Automation’s FactoryTalk uses AI to provide real-time performance analytics, allowing operators to anticipate machine malfunctions and optimize operations. Schneider Electric’s EcoStruxure incorporates predictive analytics to simplify maintenance schedules and improve decision-making.

5. Generative AI: Crafting Smarter Factory Layouts

Generative AI is transforming factory layout planning by turning it into a data-driven process. Autodesk Fusion 360 Generative Design evaluates thousands of layout configurations to determine the best possible arrangement based on production constraints. This allows manufacturers to visualize and select the most efficient setup, which has led to a 40% improvement in space utilization and a 25% reduction in material waste. By simulating layouts, manufacturers can boost productivity, efficiency and worker safety.

6. Wearable AI Devices: Hands-Free Assistance

Wearable AI devices are becoming essential tools for enhancing worker safety and efficiency on the factory floor. DAQRI smart helmets provide workers with real-time information and alerts, while RealWear HMT-1 offers voice-controlled access to data and maintenance instructions. These AI-integrated wearable devices are transforming the way workers interact with machinery, boosting productivity by 20% and reducing machine downtime by 25%.

7. Conversational AI: Simplifying Operations with Voice Commands

Conversational AI is simplifying factory operations with natural language processing (NLP), allowing workers to request updates, check machine status, and adjust schedules using voice commands. IBM Watson Assistant and AWS AI services make these interactions seamless by providing real-time insights. Factories have seen a reduction in response time for operational queries thanks to these tools, with IBM Watson helping streamline machine monitoring and decision-making processes.

Conclusion: The Future of Manufacturing Is Here

Every industrial revolution has sparked the same fear: machines will take over. But history tells a different story. With every technological leap, humans haven’t been replaced; they’ve adapted, evolved, and found new ways to work smarter. AI is no different. It’s not here to take over; it’s here to assist, making factories faster, safer, and more productive than ever.

From AR-powered guidance to AI-driven quality control, the factory floor is no longer just about machinery; it’s about collaboration between human expertise and intelligent systems. And at Mantra Labs, we’re diving deep into this transformation, helping businesses unlock the true potential of AI in manufacturing.

Want to see how AI-powered Augmented Reality is revolutionizing the manufacturing industry? Stay tuned for our next blog, where we’ll explore how AI in AR is reshaping assembly, troubleshooting, and worker training—one digital overlay at a time.
