
Deep Learning Engineer (Computer Vision & Audio Analysis)
Hubnex Labs
Bengaluru, Karnataka, India
On-site
Full-Time
About the Role
Location: Remote / Client Location
Experience: 3+ Years
Employment Type: Full-Time (Client Deployment via Hubnex Labs)
Role Overview
Join a pioneering AI team to design and deploy cutting-edge deep learning solutions for computer vision and audio analysis. You’ll leverage CNNs, Vision Transformers, attention mechanisms, and multi-modal techniques to solve complex real-world challenges in object detection, video processing, and audio classification.
Key Responsibilities
Design, develop, and optimize deep learning models for image/video analysis (object detection, segmentation) and audio classification tasks.
Implement and fine-tune CNN architectures, Vision Transformers (ViT, Swin), and attention mechanisms (SE, CBAM, self/cross-attention).
Process multi-modal data:
Video: Apply spatiotemporal modeling (3D CNNs, temporal attention)
Audio: Extract features (spectrograms, MFCCs) and build classification pipelines
Utilize pretrained models (transfer learning) and multi-task learning frameworks.
Optimize models for accuracy, speed, and robustness using PyTorch/TensorFlow.
Collaborate with MLOps teams to deploy solutions into production.
Required Skills
Programming: Advanced Python (PyTorch/TensorFlow)
Computer Vision:
Vision Transformers (ViT, Swin, DeiT)
Object detection (YOLO, SSD, Faster R-CNN, DETR)
Video analysis (temporal modeling)
Audio Processing: Feature extraction (MFCCs, spectrograms) and classification
Modeling Expertise:
Attention mechanisms (self/cross-attention, SE, CBAM)
Transfer learning and fine-tuning
Training strategies (LR scheduling, early stopping, data augmentation)
Experience handling large-scale datasets and building data pipelines.
Preferred Qualifications
Exposure to multi-modal learning (combining vision/audio/text)
Familiarity with R for statistical analysis
Publications or projects in CVPR/NeurIPS/ICML
This role is for a client of Hubnex Labs. Selected candidates will represent Hubnex while working directly with the client’s AI team.
Skills: data handling, audio processing, video analysis, TensorFlow, AI/ML, attention mechanisms, R, computer vision, Vision Transformers, object detection, transfer learning, feature extraction, advanced Python, PyTorch